Abstract


We provide the first protocol that solves Byzantine agreement with optimal early stopping (min{f + 2, t + 1}
rounds) and optimal resilience (n > 3t) using polynomial message size and computation.

All previous approaches obtained sub-optimal results and used resolve rules that looked only at the immediate 
children in the EIG (Exponential Information Gathering) tree. At the heart of our solution are new resolve rules that 
look at multiple layers of the EIG tree. 

1 Introduction 

In 1980 Pease, Shostak and Lamport [PSL80, LSP82] introduced the problem of Byzantine agreement, a fundamental 
problem in fault-tolerant distributed computing. In this problem n processes each have some initial value and the 
goal is to have all correct processes decide on some common value. The network is reliable and synchronous. If all 
correct processes start with the same initial value then this must be the common decision value, and otherwise the 
value should either be an initial value of one of the correct processes or some pre-defined default value. This should
be done in spite of at most t corrupt processes that can behave arbitrarily (called Byzantine processes). Byzantine
agreement abstracts one of the core difficulties in distributed computing and secure multi-party computation: that
of coordinating a joint decision. Pease et al. [PSL80] prove that Byzantine agreement cannot be solved for n ≤ 3t.
Therefore we say that a protocol that solves Byzantine agreement for n > 3t has optimal resilience. Fischer and Lynch
[FL82] prove that any protocol that solves Byzantine agreement must have an execution that runs for t + 1 rounds.
Dolev et al. [DRS90] prove that any protocol must have executions that run for min{f + 2, t + 1} rounds, where
f is the actual number of corrupt processes. Therefore we say that a protocol that solves Byzantine agreement with
min{f + 2, t + 1} rounds has optimal early stopping.
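
The two optimality notions just defined can be written down as a minimal sketch. The function names below are hypothetical and chosen only for illustration; the sketch merely evaluates the bounds stated above (resilience n > 3t, and min{f + 2, t + 1} rounds with f the actual number of corrupt processes).

```python
def has_optimal_resilience(n: int, t: int) -> bool:
    """True when n processes can tolerate t Byzantine processes (n > 3t)."""
    return n > 3 * t


def optimal_early_stopping_rounds(f: int, t: int) -> int:
    """Round bound min{f + 2, t + 1}, where f <= t is the actual number of faults."""
    if not 0 <= f <= t:
        raise ValueError("expected 0 <= f <= t")
    return min(f + 2, t + 1)
```

For instance, with t = 3 a fault-free execution (f = 0) is bounded by 2 rounds, while f = 2 and f = 3 are both bounded by t + 1 = 4 rounds.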

The protocol of [PSL80] has optimal resilience and optimal worst case t + 1 rounds. However the message
complexity of their protocol is exponential. Following this result, many have studied the question of obtaining a protocol
with optimal resilience and optimal worst case rounds that uses only polynomial-sized messages (and computation).

Dolev and Strong [DS82] obtained the first polynomial protocol with optimal resilience. The problem of obtaining
a protocol with optimal resilience, optimal worst case rounds and polynomial-sized messages turned out to be
surprisingly challenging. Building on a long sequence of results, Berman and Garay [BG93] presented a protocol
with optimal worst case rounds and polynomial-sized messages for n > 4t. In an exceptional tour de force, Garay
and Moses [GM93, GM98] presented a protocol for binary-valued Byzantine agreement obtaining optimal resilience,
polynomial-sized messages and min{f + 5, t + 1} rounds. We refer the reader to [GM98] for a detailed and full
account of the related work. Recently Kowalski and Mostefaoui [KM13] improved the message complexity to Õ(n^3)
but their solution does not provide early stopping and requires exponential computation.

Worst case running of t + 1 rounds is the best possible if the protocol is to be resilient to an adversary that controls
t processes. However, in executions where the adversary controls only f < t processes, the optimal worst case can be
improved to f + 2 rounds. Berman et al. [BGP92] were the first to obtain optimal resilience and optimal early stopping
(i.e. min{f + 2, t + 1} rounds) using exponential size messages. Early stopping is an extremely desirable property
in real world replication systems. In fact, agreement in a small number of rounds when f = 0 is a core advantage of
several practical state machine replication protocols (for example [CL99] and [KAD+07] focus on optimizing early
stopping in the fault free case).

Somewhat surprisingly, after more than 30 years of research on Byzantine Agreement, the problem of obtaining
the best of all worlds is still open. There is no protocol with optimal resilience, optimal early stopping and
polynomial-sized messages. The conference version of [GM98] claimed to have solved this problem but the journal version only
proves a min{f + 5, t + 1} round protocol, then says it is possible to obtain a min{f + 3, t + 1} round protocol and
finally the authors say they believe it should be possible to obtain a min{f + 2, t + 1} round protocol. We could not
see how to directly extend the approach of [GM98] to obtain optimal early stopping. The main contribution of this
paper is solving this long standing open question and providing the optimal min{f + 2, t + 1} rounds with optimal
resilience and polynomial complexity. Moreover, our result applies directly for arbitrary initial values and not only to
binary initial values, as some of the previous results.

Our Byzantine agreement protocol obtains a stronger notion of multi-valued validity. If v ≠ ⊥ is the decision
value then at least t + 1 correct processes started with value v. The multi-valued validity property is crucial in our
solution for early stopping with monitors. This property is also more suitable in proving that Byzantine agreement
implements an ideal world centralized decider that uses the majority value. We note that several previous solutions
(in particular [GM98]) are inherently binary and their extension to multi-valued agreement does not have the stronger
multi-valued validity property.

Theorem 1. Given n processes, there exists a protocol that solves Byzantine agreement. The protocol is resilient to
any Byzantine adversary of size t < n/3. For any such adversary, the total number of bits sent by any correct process
is polynomial in n and the number of rounds is min{f + 2, t + 1} where f is the actual size of the adversary.

Overview of our solution. At a high level we follow the framework set by Berman and Garay [BG93]. In this 
framework, if at a given round all processes seem to behave correctly then the protocol stops quickly thereafter. So if 
the adversary wants to cause the protocol to continue for many rounds it must have at least one corrupt process behave 
in a faulty manner in each round. However, behaving in a faulty manner will expose the process and in a few rounds 
the mis-behaving process will become publicly exposed as corrupt. 

This puts the adversary between a rock and a hard place: if too few corrupt processes are publicly exposed then the 
protocol reaches agreement quickly, if too many corrupt processes are exposed then a “monitor” framework (also called
“cloture votes”) that runs in the background causes the protocol to reach agreement in a few rounds. So the only path 
the adversary can take in order to generate a long execution is to publicly expose exactly one corrupt process each 
round. In the t < n/4 case, this type of adversary behavior keeps the communication polynomial. 

For t < n/3 a central challenge is that a corrupt process can cause communication to grow in round i but will be
publicly exposed only in round i + 2. Naively, such a corrupt process may also cause communication to grow both
in round i and i + 1 and this may cause exponential communication blowup. Garay and Moses [GM98] overcome
this challenge by providing a protocol such that, if there are at most two new corrupt processes in round i and no
new corrupt process in round i + 1 then even though they are publicly exposed in round i + 2 they cannot increase
communication in round i + 1 (also known as preventing “cross corruption”).

At the core of the binary-valued protocol of Garay and Moses is the property that one value can only be decided on
even rounds and the other only on odd rounds. This property seems to raise several unsolved challenges for obtaining
optimal early stopping. We could not see how to overcome these challenges and obtain optimal early stopping using
this property. Our approach allows values to be fixed in a way that is indifferent to the parity of the round number (and
is not restricted to binary values).

Two key properties of our protocol make it quite different from all previous protocols. First, the value of a
node is determined by the values of its children and grandchildren in the EIG tree ([BNDDS92]). Second, if agreement
is reached on a node then the value of all its children is changed to be the value of the node. This second property is
crucial because otherwise even though a node is fixed there could be disagreement about the value of its child. Since
the value of the parent of the fixed node depends on its children and grandchildren, the disagreement on the grandchild
may cause disagreement on the parent and this disagreement could propagate to the root.

The decision to change the value of the children when their parent is fixed is non-trivial. Consider the following
scenario with a node σ, child σp and grandchild σpq: some correct processes reach agreement that the value of σpq is d, then
some correct processes reach agreement that the value of σp is d' ≠ d and hence the value of σpq is changed (colored) to d'. So
it may happen that some correct processes decide the value of σ based on σpq being fixed on d and some other correct processes decide
the value of σ based on σpq being colored to d'. Making sure that agreement is reached in all such scenarios requires
us to have a relatively complex set of complementary agreement rules.

To bound the size of the tree by a polynomial size we prove that the adversary is still between a rock and a hard 
place: roughly speaking there are three cases. If just one new process is publicly exposed in a given round then the tree 
grows mildly (remains polynomial). If three or more new processes are exposed in the same round then this increases 
the size of the tree but can happen at most a constant number of times before a monitor process will cause the protocol 
to stop quickly. 

The remaining case is when exactly two new processes are exposed, then a sequence of (possibly zero) rounds 
where just one new process is exposed in each round, followed by a round where no new process is exposed. This is a 
generalized version of the “cross corruption” case of [GM98] where the adversary does not face increased risk of being 
caught by the monitor process. We prove that in these cases the tree essentially grows mildly (remains polynomial). 

In order to deal with this generalized “cross corruption” we introduce a special resolve rule (SPECIAL-BOT RULE)
tailored to this scenario. In particular, in some cases we fix the value of a node σ to ⊥ (a special default value) if
we detect enough support. This solves the generalized “cross corruption” problem but adds significant complications.
Recall that when we fix a value to a node then we also fix (color) the children of this node with the same value.

Suppose a process fixes a node σ to ⊥. The risk is that some correct processes may have used a child σp with
value d but some other correct process will see ⊥ for σp (because when σ is fixed to ⊥ we color all its children to
⊥). Roughly speaking, we overcome this difficulty by having two resolve rule thresholds. The base is the n − t
threshold (RESOLVE RULE, IT-TO-RT RULE) and the other is an n − t − 1 threshold (RELAXED RULE). In essence
this n − t − 1 rule is resilient to disagreement on one child node (that may occur due to coloring). We then make sure
that the SPECIAL-BOT RULE can indeed change only one child value. This delicate interplay between the resolve rules
is at the core of our new approach.

The adversary. Given n > 3t and φ ≤ t, as in [GM98], we will consider a (t, φ)-adversary: an adversary that
can control up to φ corrupt processes that behave arbitrarily and at most t − φ corrupt processes that are always silent
(send some default value ⊥ to all processes every round). The (t, φ)-adversary will be useful to model executions in
which all correct processes have detected beforehand some common set of at least t − φ corrupt processes and hence
ignore them throughout the protocol. Note that the standard t-adversary is just a (t, t)-adversary.


2 The EIG structure and rules 

In this section we define the EIG structure and rules. 

Let N be the set of processes, n = |N| and assume that n > 3t. Let D be a set of possible decision values. We
assume some decision ⊥ ∈ D is the designated default decision.

Let Σ_r be the set of all sequences of length r of elements of N without repetition. Let Σ_0 = {ε}, where ε is the empty sequence.
Let Σ = ∪_{0≤i≤t+1} Σ_i. An Exponential Information Gathering tree (EIG in short) is a tree whose nodes are elements
in Σ and whose edges connect each node to the node representing its longest proper prefix. Thus, node ε has n children,
and a node from Σ_k has exactly n − k children.

We will typically use the Greek letter σ to denote a sequence (possibly empty) of labels corresponding to a node
in an EIG tree. We use the notation σq to denote the node in the EIG tree that corresponds to the child of node σ that
corresponds to the sequence σ concatenated with q ∈ N. We denote by ε the root node of the tree that corresponds
to the empty sequence. Given two sequences σ, σ' ∈ Σ, let σ' ⊏ σ denote that σ' is a proper prefix of σ and σ' ⊑ σ
denote that σ' is a prefix of σ (potentially σ' = σ).
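
As an illustration of the tree structure and the prefix notation, the following minimal sketch represents a node as a Python tuple of process identifiers, with the empty tuple standing for the root. The helper names (`sigma_level`, `children`, `is_prefix`) are hypothetical and are not part of the protocol description.

```python
from itertools import permutations


def sigma_level(processes, r):
    """All sequences of length r over `processes` without repetition."""
    return [tuple(p) for p in permutations(processes, r)]


def children(node, processes):
    """Children of `node`: the node extended by any process not already in it."""
    return [node + (q,) for q in processes if q not in node]


def is_prefix(a, b, proper=False):
    """True if sequence a is a prefix of sequence b (a proper prefix if requested)."""
    if proper and len(a) == len(b):
        return False
    return b[:len(a)] == a
```

With n = 4 processes the root has 4 children and every node of length 1 has 3 children, matching the n − k count for a node of length k.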

In the EIG consensus protocol each process maintains a dynamic tree data structure IT. This data structure maps
a set of nodes in Σ to values in D. Intuitively, this tree contains all the information the process has heard so far.
Each process z also maintains two global dynamic sets F_z, FA_z. The set F contains processes that z detected as faulty,
and FA contains processes that z knows are detected by all correct processes. The protocol for updating F, FA is
straightforward:

• In each round the processes exchange their F lists and update their F and FA sets once a faulty process appears
in t + 1 or 2t + 1 lists, respectively.

• When a process is detected as faulty every correct process masks its future messages to ⊥.
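
The threshold update of the two faulty sets can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, `received_lists` is assumed to hold one faulty list per reporting process for the current round, and whether a process counts its own list is left to the caller.

```python
from collections import Counter


def update_fault_sets(faulty, faulty_all, received_lists, t):
    """Return updated copies of the two sets after one round of exchanged lists.

    A process joins `faulty` once it appears in at least t + 1 lists and
    joins `faulty_all` once it appears in at least 2t + 1 lists.
    """
    # Count each reporter at most once per suspected process.
    counts = Counter(p for lst in received_lists for p in set(lst))
    new_faulty = set(faulty) | {p for p, c in counts.items() if c >= t + 1}
    new_faulty_all = set(faulty_all) | {p for p, c in counts.items() if c >= 2 * t + 1}
    return new_faulty, new_faulty_all
```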

The basic EIG protocol will be invoked repeatedly, and several copies of the EIG protocol may be running
concurrently. The accumulated set of faulty processes will be used across all copies (the rest of the variables and data
structures are local to each EIG invocation). Therefore, we assume that when the protocol is invoked the following
property holds:

Property 1. When the protocol is invoked, no correct process appears in the faulty sets of any other correct process.
Moreover, FA_p ⊆ F_p and FA_p ⊆ F_q for any two correct processes p and q.

Each invocation of the EIG protocol is tagged with a parameter φ, known to all processes. An EIG protocol with
parameter φ will run for at most φ + 1 rounds. At the beginning of the agreement protocol the faulty sets are empty
at all correct processes and the EIG protocol with parameter φ = t is executed. Each additional invocation of the
EIG protocol is with a smaller value of φ. In the non-trivial case, when the EIG protocol with parameter φ is invoked
then |FA_i| ≥ t − φ. There will be one exception to this assumption, and it is handled in Lemma 1. Thus, other
than in that specific case, it is assumed that we have a (t, φ)-adversary during the execution of the EIG protocol with
parameter φ.

The basic EIG protocol for a correct process z with initial value d_z ∈ D is very simple:

1. Init: Set IT(ε) := d_z, so IT(ε) is set to be the initial value.

2. Send: in each round r, 1 ≤ r ≤ φ + 1, for every σ ∈ IT ∩ Σ_{r−1} such that z ∉ σ, send the message (σ, z, IT(σ))
to every process.

3. Receive set: in each round r, let S_r := {σx ∈ Σ_r}.

4. Receive rule: in each round r, for all σx ∈ S_r set

IT(σx) := ⊥ if x ∈ F;
IT(σx) := d if x ∉ F sent (σ, x, d) and d ∈ D;
IT(σx) := IT(σ) otherwise.

Note: assigning IT(σx) := IT(σ) when x ∉ F is crucial for the case where x is correct and has halted in the
previous round. Thus, if a process is silent but is not detected (possibly because it has halted due to early stopping) z
assigns it the value it heard in the previous round.
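
The receive rule can be illustrated with a minimal sketch. The representation is an assumption made for the example: nodes are tuples of process identifiers, the tree is a dictionary from nodes to values, `received` maps a pair (parent node, sender) to the value the sender reported, and the string `BOT` stands for the default value. The function name is hypothetical.

```python
BOT = "BOT"  # stand-in for the default value (an assumption of this sketch)


def apply_receive_rule(tree, receive_set, received, faulty, domain):
    """Fill in the value of every node in `receive_set` for the current round."""
    for node in receive_set:
        parent, sender = node[:-1], node[-1]
        if sender in faulty:
            # Detected processes are masked to the default value.
            tree[node] = BOT
        elif received.get((parent, sender)) in domain:
            # A well-formed message from an undetected process is recorded.
            tree[node] = received[(parent, sender)]
        else:
            # Silent or malformed but undetected: reuse the parent's value.
            tree[node] = tree[parent]
    return tree
```

The last branch is the case highlighted in the note above: a correct process that has already halted sends nothing, and its missing message is filled with the value recorded for the parent node.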

We use a second dynamic EIG tree data structure RT. Intuitively, if a process puts a value in a node of this tree
then, essentially, all correct processes will put the same value in the same node in at most 2 more rounds. Processes
use several rules to close branches of the IT tree whose value in RT is already determined by all. We present later the
rules for closing branches of the IT tree. To handle this, we modify lines 2 and 3 as described below (and keep lines
1 and 4 as above).

2. Send: in each round r, 1 ≤ r ≤ φ + 1, for every σ ∈ IT ∩ Σ_{r−1} such that z ∉ σ and the branch σ is not closed,
send the message (σ, z, IT(σ)) to every process.

3. Receive set: in each round r, let S_r = {σx ∈ Σ_r | branch σx is not closed}.

Informally, IT_z(σp) = d (where IT_z denotes the IT tree at process z) indicates that process z received a message
from process p that said that its value for σ was d. RT_z(σp) = d indicates, essentially, that process z knows that
every correct process x will agree and have RT_x(σp) = d in at most two more rounds.

Observe that we record in the EIG tree only information from sequences of nodes that do not contain repetition, 
therefore, not every message a process receives will be recorded. 

At the end of each round, we apply the rules below to determine whether to assign values to nodes in RT; assigning
a value in RT is called resolving the node.

2.1 The Resolve Rules 

A key feature of our algorithm is that whenever we put a value into RT(σ) we also color (assign) all the descendants
of σ in RT with the same value. Observe that this means we may color a node σw in RT to d even if w is correct and
sent d' ≠ d to all other correct processes.
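
The put-and-color step can be sketched as below. This is an illustrative sketch, not the protocol's data structure: the resolved tree is assumed to be a dictionary from tuple nodes to values, `all_nodes` is assumed to enumerate the nodes currently tracked, and the function name is hypothetical. Following the description above, coloring overwrites whatever value a descendant held before.

```python
def put_and_color(resolved, node, value, all_nodes):
    """Put `value` at `node` if it is unresolved, then color all its descendants.

    Returns True if the put happened and False if the node was already resolved.
    """
    if node in resolved:
        return False
    for other in all_nodes:
        # A node is `node` itself or a descendant exactly when `node` is its prefix.
        if other[:len(node)] == node:
            resolved[other] = value
    return True
```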

Rules for IT-to-RT resolve: The following definitions and rules cause a node to be resolved based on information
in IT.

1. If IT(σw) = d then we say: (1) w is a voter of (σ, w, d); (2) w is confirmed on (σ, w, d); (3) for all v ∈ N \ {σ},
w is a supporter of v on (σ, w, d).

Note: the reason that we count w as a voter, as confirmed and as a supporter for all its echoers is that due to the
EIG structure w does not appear in the subtree of σw.

2. If IT(σwv) = d, then we say that v is a supporter of v for (σ, w, d).

Note: again we need v to be a supporter of itself because of the EIG structure.

3. If IT(σwvu) = d then we say that u is a supporter of v for (σ, w, d).

4. If there is a set |U| = n − t, such that for each u' ∈ U, u' is a supporter of v on (σ, w, d) then we say that v is
confirmed on (σ, w, d).

Note: if σ contains no correct process and w is correct, then any correct child v (of σw) will indeed have n − t supporters
for σw and hence will be confirmed. Note that one supporter is w, the other is v and the remaining are all the
n − t − 2 correct children of σwv. Also note that w is confirmed, so all n − t correct processes will be confirmed on (σ, w, d).

5. If u ≠ w has a set |V| = n − t, such that for each v' ∈ V, u is a supporter of v' on (σ, w, d) and v' is confirmed
on (σ, w, d) then u is a voter of (σ, w, d).

Note: this is somewhat similar to the notion of a Voter in grade-cast ([FM97, FM88]). But there is a crucial
difference: all the n − t echoers need to be confirmed. Also note that w is a voter for itself.

6. IT-TO-RT RULE: If w has a set |U| = n − t, such that for each u' ∈ U, u' is a voter of (σ, w, d) then if σw ∉ RT,
then put RT(σw) := d and color descendants of σw with d as well.

Note: this is somewhat similar to the notion of a grade 2 in grade-cast. A crucial difference is that the n − t
voters needed are defined with respect to supported echoers. This is a non-trivial change that breaks the standard
grade-cast properties. Also note that we not only put a value in σw but also color all the descendants.

7. ROUND φ+1 RULE: if IT(σw) = d and σ ∈ Σ_φ then if σw ∉ RT, then put RT(σw) := d.

Note: this is a standard rule to deal with the last round.
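
To make the chain supporter, confirmed, voter, resolve concrete, here is a minimal sketch restricted to the root node, where the sender w plays no separate role. It is one reading of definitions 2 to 6 under that restriction, not a general implementation: the tree is assumed to be a dictionary from tuple nodes to values, and the function names are hypothetical.

```python
def root_supporters(tree, v, d, processes):
    """Processes supporting v for value d at the root.

    v supports itself if the node (v,) holds d; u supports v if (v, u) holds d.
    """
    sup = {u for u in processes if u != v and tree.get((v, u)) == d}
    if tree.get((v,)) == d:
        sup.add(v)
    return sup


def root_it_to_rt_applies(tree, d, processes, t):
    """True if at least n - t processes are voters for d at the root."""
    need = len(processes) - t
    sup = {v: root_supporters(tree, v, d, processes) for v in processes}
    # Confirmed: supported by at least n - t processes.
    confirmed = {v for v in processes if len(sup[v]) >= need}
    # Voter: supports at least n - t processes that are themselves confirmed.
    voters = {
        u for u in processes
        if sum(1 for v in confirmed if u in sup[v]) >= need
    }
    return len(voters) >= need
```

With n = 4, t = 1 and three correct processes that all report d at both levels, the three correct processes confirm and vote for one another, so the check succeeds for d and fails for any other value.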

Rules for RT tree resolve: The following definitions and rules cause a node to be resolved based only on
information in RT (these rules do not look at IT).

1. If there is a set |U| = t + 1, such that for each u' ∈ U, RT(σwvu') = d then we say v is RT-confirmed on
(σ, w, d).

Note: if any correct process sees a node as confirmed then it has n − t that echo its value. At least t + 1 of them are
correct and they all cause all correct processes to see the node as RT-confirmed. Of course a node may become
RT-confirmed even if it was never confirmed by any correct process. Observe that if RT(σwu) = d then, by coloring, u is
RT-confirmed on (σ, w, d).

2. If u ≠ w has a set |V| = n − t, such that each v' ∈ V is RT-confirmed on (σ, w, d) and for each v' ∈ V \ {u},
RT(σwv'u) = d and if u ∈ V then also RT(σwu) = d, then u is an RT-voter of (σ, w, d).

Note: if any correct process sees a node as a voter then it has n − t echoers that are confirmed. So each of these
n − t echoers will be RT-confirmed. So all correct processes will see this node as an RT-voter. Of course a node
can become an RT-voter even if it was never a voter at any correct process.

3. RESOLVE RULE: If w has a set |U| = t + 1, such that for each u' ∈ U, u' is an RT-voter of (σ, w, d) then if
σw ∉ RT, then put RT(σw) := d, and color descendants of σw with d as well. The rule applies also for node
σw = ε.

Note: if any correct process does IT-TO-RT RULE then this rule tries to guarantee that all correct processes will
also put this node in RT. The problem is that SPECIAL-BOT RULE (see below) may be applied to one of the
echoers and this may cause some of the RT-voters to lose their required support. The following rule fixes this
situation. It reduces the threshold to n − t − 1 but requires that all children nodes are fixed.

4. RELAXED RULE: If all the children of σw are in RT (i.e., ∀σwv ∈ Σ: σwv ∈ RT) and there exists a set |V| = n − t − 1,
such that for each v' ∈ V, RT(σwv') = d, then if σw ∉ RT, then put RT(σw) := d, and color descendants of
σw with d as well. The rule applies only for nodes |σw| > 1.

Note: as mentioned above, the RELAXED RULE requires a threshold of n − t − 1 so that it can take into account
the possibility of one value changing to ⊥ due to the following rule:

5. SPECIAL-BOT RULE: If there is a set |V| = t + 2 − |σwu| such that for all v ∈ V, RT(σwuv) = ⊥ and for all
u' ≠ u such that σwu' ∈ Σ, σwu' ∈ RT then if σwu ∉ RT, then put RT(σwu) := ⊥, and color descendants
of σwu with ⊥ as well. The rule applies only for |σwu| > 2.

Note: This rule can be applied to at most one child.

6. SPECIAL-ROOT-BOT RULE: If there exists a set |U| = t + 1 such that for each u ∈ U, RT(u) = ⊥ then if ε ∉ RT,
then put RT(ε) := ⊥, and color descendants of ε with ⊥ as well.

Note: this rule is important in order to stop quickly if t + 1 correct processes start with the value ⊥.
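
As a small illustration of the simplest of these rules, the condition of the SPECIAL-ROOT-BOT RULE can be checked as follows. The sketch only tests whether the rule fires; the put and the coloring are left out. The dictionary representation, the `BOT` stand-in for the default value and the function name are assumptions of the example.

```python
BOT = "BOT"  # stand-in for the default value (an assumption of this sketch)


def special_root_bot_applies(resolved, processes, t):
    """True if the root is unresolved and at least t + 1 of its children hold BOT."""
    if () in resolved:
        return False
    count = sum(1 for u in processes if resolved.get((u,)) == BOT)
    return count >= t + 1
```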

To prevent the data structures from expanding too much, processes close branches of the tree, and from that point
on they do not send messages related to the closed branches. We use the notation {σ ∈ RT[r]} to denote an indicator
variable that equals true if RT(σ) was assigned some value by the end of round r, and false otherwise.

Branch Closing and Early Resolve rules: There are three rules to close a branch in IT; two of them also trigger
an early resolve. By the end of round r, r < φ,

1. DECAY RULE: if ∃σ' ⊑ σ such that σ' ∈ RT[r − 1], then close the branch σ ∈ IT.

Note: this is the simple case; if a process already fixed the value of σ' in RT in round r − 1 then it stops in the
end of round r, since by the end of round r + 1 all correct processes will put σ' in RT (and will interpret this
process’s silence in the right way during round r + 1). There is no need to continue. Coloring will fix all the
values of this subtree.

2. EARLY_IT-TO-RT RULE: if σ ∈ S_{r−1} and there exists U ⊆ N, U ∩ {u' | u' ∈ σ} = ∅, |U| = n − r, such that for every
u, v ∈ U \ F, IT(σu) = IT(σv), then if σ ∉ RT, then put RT(σ) := IT(σ) and close the branch σ ∈ IT.

Note: this is a case where the process can forecast that all correct processes will put σ in RT in the next round
(because the process sees that all children nodes agree). So the process can fix σ in this round and stop now,
because all correct processes will fix σ in RT next round (and will interpret this process’s silence in the right way).

3. STRONG_IT-TO-RT RULE: if σ ∈ S_{r−2} and there exists U ⊆ N, U ∩ {u' | u' ∈ σ} = ∅, |U| = n − r + 1 such that
for every u, v ∈ U \ F where v ≠ u, IT(σuv) = IT(σvu) then, if σ ∉ RT, then put RT(σ) := IT(σ) and
close the branch σ ∈ IT.

Note: in this case all the correct children of σ except for at most one will be fixed in the next round to the same
value, so the RELAXED RULE will be applied to σ in the next round. So we can fix σ in this round and stop now.
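
The existence condition of the EARLY_IT-TO-RT RULE can be checked without enumerating subsets. The sketch below is one reading of the rule under stated assumptions: nodes are tuples, the tree is a dictionary that already holds a value for every child of the node, and the function name is hypothetical. A suitable set U of size n − r exists exactly when the detected children together with the largest group of equal-valued undetected children number at least n − r.

```python
from collections import Counter


def early_it_to_rt_applies(tree, node, processes, faulty, r):
    """True if n - r children of `node` agree once detected processes are ignored."""
    if len(node) != r - 1:
        return False  # the rule is stated for nodes of the previous round
    candidates = [u for u in processes if u not in node]
    masked = sum(1 for u in candidates if u in faulty)
    values = Counter(tree[node + (u,)] for u in candidates if u not in faulty)
    largest = max(values.values(), default=0)
    return masked + largest >= len(processes) - r
```

Only the test is shown here; putting the value and closing the branch are left to the caller.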


In each round all the above rules are applied repeatedly until none holds any more. 

The rules above imply that there are two ways to give a value to a node in RT. One is assigning it a value using
the various rules, and the other is coloring it as a result of assigning a value to one of its predecessors. We will use the
term color for the second one and the term put for the first one.

Rules for fault detection and masking: The following definitions and rules are used to detect faulty processes,
put them into F and hence mask them (all messages from F are masked to ⊥). The last rule also defines an additional
masking. The process first updates its F and FA sets using the sets received from the other processes during the
current round. A process is added to F or FA once it appears in t + 1 or 2t + 1 sets, respectively. Next the process
applies the following fault detection rules. The fault detection is executed before applying any of the resolve rules
above. When a new process is added to F, the new masking is applied and the fault detection is repeated until no new
process can be added. Only then are the resolve rules above applied.

At process z by the end of round r: 

1. Not Voter: If ∃σw ∈ S_{r−1} and w ≠ z and ∄σ' ⊑ σw such that σ' ∈ RT and it is not the case that there exists
a set |U| = n − t − 1 such that for each u' ∈ U, IT(σwu') = IT(σw) then add w to F.

Note: this is the standard detection rule after one round: if anything looks suspicious then detect.

2. Not IT-to-RT: If ∃σw ∈ S_{r−2} for which w does not have a set |U| = n − t, such that for each u' ∈ U, u' is a
voter of (σ, w, d), and ∄σ' ⊑ σ such that σ' ∈ RT then add w to F.

Note: this is the standard detection rule after two rounds: if anything looks suspicious then detect.

3. If u, u ≠ w, has a set |V| = n − t, such that for each v' ∈ V, u is a supporter of v' on (σ, w, d) then we say
that u is an unconfirmed voter of (σ, w, d).

Note: the notion of an unconfirmed voter is exactly that of a voter in the standard grade-cast protocol.

4. If w has a set |U| = t + 1, such that for each u' ∈ U, u' is an unconfirmed voter of (σ, w, d) then we say that
σw is leaning towards d.

Note: the notion of leaning towards is exactly that of getting grade ≥ 1 in the standard grade-cast protocol.

5. Not Masking: If σw is leaning towards d and there exists u, |V| = t + 1, and d' ≠ d such that for each
v' ∈ V, IT(σwuv') = d' and there exists |σ''| > |σ| such that IT(σ''wu) ≠ ⊥ then

(a) IT(σ''wu) := ⊥;

(b) if by the end of the round ∄σ' ⊑ σ''w such that σ' ∈ RT then add u to F.

Note: If σw is leaning towards d then u must have heard at least t + 1 say d on σw. If t + 1 say u said d' then u
must have said d' to some correct process. So u must have received d' from σw but in the next round u hears t + 1 say
σw said d. So u must conclude that w is faulty and u must mask w from the next round. If u did not mask
some σ''wu then the Not Masking rule will detect u as faulty, mask all such σ''wu for u and also mark
u as faulty. The reason we wait until the end of the round to add that node to F is that it might be a node of
a correct process that stopped in the previous round and hence did not send any messages in the current round,
and therefore did not send masking. In such a case we mask its virtual sending, but do not add it to F.

Finalized Output: By the end of each round (after applying all the resolve rules), the process checks whether there
is a frontier in RT. A frontier (also called a cut) is said to exist if for all σ there exists some sub-sequence
σ' ⊑ σ such that σ' ∈ RT.

1. Early Output rule: By the end of a round, if ε ∈ RT, output RT(ε).

2. Final Output rule: Otherwise, if there is a frontier, output ⊥.

Observe that the existence of a frontier can be tested from the current IT in O(|IT|) time.
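
A frontier test can be sketched as follows. Two assumptions are made for the example and are not taken from the text: the quantifier "for all σ" is read as ranging over the leaves of the tree currently held by the process, and trees are represented as collections of tuple nodes. The function name is hypothetical. For brevity the sketch checks each leaf's prefixes directly, which costs time proportional to the tree size times its depth rather than the linear bound mentioned above.

```python
def has_frontier(tree_nodes, resolved):
    """True if every leaf of the tree has some prefix (possibly itself) resolved."""
    nodes = set(tree_nodes)
    internal = {node[:-1] for node in nodes if node}
    leaves = nodes - internal
    return all(
        any(leaf[:k] in resolved for k in range(len(leaf) + 1))
        for leaf in leaves
    )
```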

Stopping rule: If all branches of IT are closed, stop the protocol. 

3 The Consensus Protocol Analysis 

The EIG protocol implicitly presented in the previous section is a consensus protocol V_φ, where φ, 1 ≤ φ ≤ t, is
a parameter. Protocol V_φ runs for at most φ + 1 rounds and solves Byzantine agreement against a (t, φ)-adversary.
Denote by G the set of correct processes, |G| ≥ n − t, where n = |N|, and by S, S = ∩_{q∈G} F_q, the set of processes
that are masked to ⊥ by all correct processes. Let s := |S|.

Our solution invokes several copies of the EIG protocol. For each invoked protocol, V_φ, there are two cases: either
s ≥ t − φ, or we are guaranteed that the input of all correct processes that start the protocol is the same (in particular,
it may be that some correct processes have halted and do not start the protocol). The following lemma deals with this
latter case.

Lemma 1. [Validity and Fast Termination] For any (t, t)-adversary, and n ≥ 3t + 1,

1. if every correct process that starts the protocol holds the same input value d then d is the output value of all
correct processes that start the protocol, by the end of round 2, and all of them complete the protocol by the end
of round 3.

2. if all correct processes start the protocol and t + 1 correct processes start with ⊥ then all correct processes
output ⊥ by the end of round 3 and stop the protocol by the end of round 4.

3. For p, q ∈ G, no p will add q to F_p in either of the above cases.

Proof To prove the first item, let us follow the protocol. Let Gi be the set of correct processes that start the protocol 
and let G 2 = G \ Gi, be the remaining correct processes that remain silent throughout the protocol. 

Initially, for every z € Gi, XTz(e) = dz. 

In the 1st round every correct process z G Gi sends (e, z, dz) to every process. By the end of the 1st round, 
every correct process applies the receive rule for all the other processes. Thus, every correct process z £ Gi has 
ITz{x) := dx, for every x € G, since it completes the missing values from correct processes in G 2 to be its own 
input value. Thus, the receive rule assigns at each z G Gi, TTzio'x) := ITzip) for a missing value by a; G G 2 for a. 
EARLY_lT-TO-RT RULE, may be applied by some correct processes at the end of the first round, and as a result will put 
TlT{e) = d and will output d. 

Since IT_z(x) = d for all x ∈ G, by the end of the 1st round, every z ∈ G_1 sees every x ∈ G as supporter of x for 
(ε, ε, d). 

In the 2nd round, every correct process z ∈ G_1 that did not apply EARLY_IT-TO-RT RULE by the end of the 1st 
round sends (x, z, d) for every process x ∈ G to every process. Again, if any correct process did not send a message, 
its missing value for any x ∈ G will be assigned the same value at all correct processes. Notice that some additional 
correct processes may not send in the second round. 

By the end of the 2nd round, after applying the receive rule, at each z ∈ G_1 that did not apply EARLY_IT-TO-RT 
RULE by the end of the 1st round, IT_z(xy) = d for every x, y ∈ G. Thus, for every such x, every y ∈ G \ {x} is a 
supporter of x for (ε, ε, d). As a result, for the set G, |G| = n − t, each u' ∈ G is a supporter of v for (ε, ε, d), for 
every v ∈ G. Therefore, every v ∈ G is confirmed on (ε, ε, d). Therefore, every process z ∈ G_1 that did not apply 
EARLY_IT-TO-RT RULE by the end of the 1st round sees every process x ∈ G as a voter of (ε, ε, d). This implies that 
it can apply the IT-TO-RT RULE and will put RT(ε) = d, will output d, and will stop the protocol by the end of round 3. 

For the second claim: by the end of the 1st round, every correct process z has IT_z(x) := ⊥, for at least t + 1 
processes x ∈ G. Let A = {x | x ∈ G ∧ d_x = ⊥}. If ⊥ was the input value to all correct processes, we are done by 
the previous claim. Otherwise, no correct process will apply EARLY_IT-TO-RT RULE to a value that is not ⊥. 

In the 2nd round, every correct process z sends (x, z, d_x) for every process x ∈ G to every process. By the end of 
the 2nd round, after applying the receive rule, at each z ∈ G, IT_z(xy) = d_x, for every x, y ∈ G. Thus, every process 
z ∈ G sees each process v ∈ A both as supporter of v for (ε, v, ⊥), and also as supporter of u for (ε, v, ⊥) for every 
u ∈ G. 

In the 3rd round, every correct process z sends (vx, z, ⊥) for every process v ∈ A and x ∈ G to every process. 
If any correct process applied EARLY_IT-TO-RT RULE in the previous round, then its missing value regarding other 
correct processes will be identical at all correct processes. By the end of the 3rd round, after applying the receive 
rule, at each z ∈ G, IT_z(vxy) = ⊥, for every v ∈ A, and x, y ∈ G. Thus, every process z ∈ G sees every process 
u ∈ G \ {v} as a supporter of u' for (ε, v, ⊥), for every u' ∈ G \ {v} and v ∈ A. For every such v and u', v is also a 
supporter of u' for (ε, v, ⊥). Thus, every such u' is confirmed on (ε, v, ⊥), for every v ∈ A. Moreover, by definition, 
every such v is also confirmed on (ε, v, ⊥). 

As a result, every process z ∈ G sees every process u ∈ G as a voter to (ε, v, ⊥), for every v ∈ A. Thus, v ∈ RT_z 
for every v ∈ A. Thus, it can apply the SPECIAL-ROOT-BOT RULE and will put RT(ε) = ⊥ by the end of round 3, 
and will stop by the end of round 4. 

To prove the 3rd claim, observe that the fault detection rules can be applied only in rounds 2 or 3. If a correct 
process did not send any message in round 2, it is because it applied EARLY_IT-TO-RT RULE, and its missing values 
will not cause any correct process, including the one that did not send, to be suspected as faulty. 
By the end of the 2nd round ε will be in RT_q for every q ∈ G, and no one will apply any fault detection rules 
anymore. 

If all correct processes participated in round 2, then Not-Voter will not apply to any correct process. If any correct 
process did not send any message in round 3, its missing values will not harm any correct process or itself and all 
correct processes will be in RT by the end of the round. For similar reasons, the Not-Masking rule will not cause any 
correct process to be added to F. □ 

The only case in which not all correct processes invoke a V^j, protocol is when some of the background running 
monitors are being invoked by some of the correct processes, while others may have already stopped. This special case 
is guaranteed to arise only when the inputs of all participating correct processes are ⊥, and consensus can still be achieved. 
Lemma 1 implies the following: 

Corollary 1. For any (t, t)-adversary, and n ≥ 3t + 1, if every correct process that invokes the protocol starts with 
input ⊥, then ⊥ is the output value at each participating correct process by the end of round 2, and each participating 
correct process completes the protocol by the end of round 3. Moreover, for p, q ∈ G, no p will add q to F_p. 

The gossip exchange among correct processes about identified faults ensures the following: 

Lemma 2. For a (t, φ)-adversary and protocol, n ≥ 3t + 1, assuming Property 1, for any k, 1 ≤ k ≤ φ, by 
the end of round k, for every two correct processes p, q, FA_p ⊆ F_q and FA_p[k − 1] ⊆ FA_p[k]. 

Proof Prior to invocation the claim holds by Property 1. In each round processes exchange their F sets. If a process 
finds out that some process b appears in the lists of at least t + 1 processes it adds b to F, and if b appears in 2t + 1 
lists it adds b to both F and FA. The F and FA sets never decrease, and FA is updated only through gossiping. 
Therefore, by the end of each round the claim holds. □ 
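
The gossip step used in this proof can be illustrated with a minimal Python sketch. It assumes the two thresholds read as t + 1 and 2t + 1 lists; the function name gossip_update and the dictionary representation of the exchanged F sets are hypothetical and are not taken from the protocol's specification.

```python
def gossip_update(reports, t, F, FA):
    """One gossip step at a single process (illustrative sketch only).

    reports: dict mapping a reporting process id to the set of process ids
             that the reporter currently lists in its F set (assumed format).
    t:       bound on the number of corrupt processes.
    F, FA:   this process's current F and FA sets.
    Returns new (F, FA); both sets only ever grow.
    """
    counts = {}
    for suspects in reports.values():
        for b in suspects:
            counts[b] = counts.get(b, 0) + 1

    new_F, new_FA = set(F), set(FA)
    for b, c in counts.items():
        if c >= t + 1:        # at least one correct reporter lists b
            new_F.add(b)
        if c >= 2 * t + 1:    # at least t + 1 correct reporters list b
            new_FA.add(b)
    return new_F, new_FA
```

With 2t + 1 reports, at least t + 1 come from correct processes, so every other correct process sees at least t + 1 lists containing b; this is the counting behind FA_p ⊆ F_q.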

A node may initially be assigned a value using one of the “put” rules and later be colored to a different value. In 
the arguments below we sometimes need to refer to the value that was put to a node rather than the value it might be 
colored to. Once a node has a value it is not assigned a value using any put rule any more. Thus, the value assigned 
using a put rule is an initial value that may be assigned to a node before it is colored, or that node may never have a 
value put to it. To focus on these put operations, we will add, for proof purposes, that whenever a process p uses a put 
rule for some σ, except ROUND φ + 1 RULE, it also puts σ in PT_p (the “Put-Tree”) and as a result at that moment, 
PT_p(σ) = RT_p(σ). We do not color nodes in PT_p, thus for σ that is colored, but was not assigned a value prior to 
that, PT_p(σ) is undefined. We exclude ROUND φ + 1 RULE from PT on purpose. 
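
The bookkeeping described above can be sketched as follows. This is only an illustration under stated assumptions: nodes are represented as tuples of process ids, the class name Trees, its method names, and the rule label "ROUND_PHI_PLUS_1" are hypothetical, and coloring is modeled as a plain overwrite of the resolve-tree entry.

```python
class Trees:
    """Illustrative pair of resolve tree (rt) and put-tree (pt) at one process."""

    def __init__(self):
        self.rt = {}  # node -> current value (put or colored)
        self.pt = {}  # node -> value originally assigned by a put rule

    def put(self, sigma, value, rule):
        """Apply a put rule; a node that already has a value is never put again."""
        if sigma in self.rt:
            return False
        self.rt[sigma] = value
        # For proof purposes every put rule except ROUND phi + 1 RULE
        # is also recorded in the put-tree.
        if rule != "ROUND_PHI_PLUS_1":
            self.pt[sigma] = value
        return True

    def color(self, sigma, value):
        """Coloring changes rt only; pt is never colored."""
        self.rt[sigma] = value
```

In this model a node that is colored before any put rule reaches it appears in rt but has no entry in pt, which matches the statement that its put-tree value is undefined.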

The following is the core statement of the technical properties of the protocol. The only way we found to prove all 
these is via an induction argument that proves all properties together. The theorem contains four items. 

The detection part proves that correct processes are never suspected as faulty. The challenge is that the various 
rules instruct processes when to stop sending messages, and that might cause other correct processes to be suspected 
as faulty. 

The validity part proves that if a correct process sends a value, it will reach the RT of every other correct process 
within two rounds. It also proves that if a correct process decides not to send a value (thus, closed a branch), the 
appropriate node will be in the RT of every correct process. The third claim in the validity part is that if a process appears 
in FA, then it appears in the RT of every correct process within two rounds. 

The safety part intends to prove consistency in the RT. The challenge is that coloring may cause the trees of 
correct processes to differ. Therefore the careful statement looks at PT, and at which rule was used in order to assign 
the value to it. The ⊥ value is a default value, therefore there is a special consideration of whether the value the process 
puts is ⊥ or not. The end result is that if a node appears in the PT of two correct processes, it carries the same value. 

The liveness part shows that if a node appears in the RT of a correct process, it will appear in the RT of any other correct 
process within two rounds. 

Theorem 2. For a (t, φ)-adversary and protocol TT^, n ≥ 3t + 1, assuming Property 1 and that all correct processes 
participate in the protocol, then for any 1 ≤ k ≤ φ + 1: 

1. No False Detection: For p, q ∈ G, no q will add p to F_q in round k. 

2. Validity: 

(a) For σ ∈ Σ_{k−3}, if p ∈ G sends (σ, p, d_p), then at the end of round k, at every correct process x, either 
RT_x(σp) = d_p or ∃σ' ⊑ σ such that σ' ∈ RT_x. For k = φ + 1, the property holds also for any σ ∈ Σ_{k−2} 
and for any σ ∈ Σ_{k−1}. 

(b) If z ∈ FA in the beginning of round k − 2, then by the end of round k, at every correct process, either 
RT(σz) = ⊥ or ∃σ' ⊑ σ such that σ' ∈ RT. For k = φ + 1, the property holds for z ∈ FA in the 
beginning of rounds k − 1 or k. 

(c) For σ ∈ Σ_{k−1}, if p ∈ G does not send (σ, p, d) for any d ∈ D, then at the end of round k, at every correct 
process x, ∃σ' ⊑ σ such that σ' ∈ RT_x. 

3. Safety: For p, q ∈ G, x ∈ N, |σx| < φ, σx ∈ PT_p[k], then 

(a) if p applies RESOLVE RULE to put PT_p(σx) = d, d ≠ ⊥, and v is one of the RT-confirmed nodes 
on (σ, x, d) in RT_p used in applying this rule in RT_p, and in addition PT_q(σxv) = ⊥, then q applied 
SPECIAL-BOT RULE to put σxv; 

(b) if |σx| ≥ 1 and PT_p(σx) = d, d ≠ ⊥, then, by the end of round k, |V_q| ≤ t, where V_q = {u | 
PT_q(σxu) = ⊥}; 

(c) if |σx| ≥ 1 and PT_p(σx) = ⊥ and it wasn’t put using SPECIAL-BOT RULE, then, by the end of round k, 
|V_q| ≤ t, where V_q = {u | PT_q(σxu) ≠ ⊥}; 

(d) if σx ∈ PT_q[k], then PT_p(σx) = PT_q(σx). 

4. Liveness: For p, q ∈ G, if σ ∈ RT_p[k − 2] then σ ∈ RT_q[k]. For k = φ + 1, if σ ∈ RT_p then σ ∈ RT_q. 
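
The condition that recurs throughout the theorem, that some prefix of a node already has a value in the resolve tree, can be made concrete with a small sketch. It assumes that nodes are tuples of process ids with the empty tuple as the root, and that the prefix relation includes the node itself; the function name has_resolved_prefix is hypothetical.

```python
def has_resolved_prefix(sigma, rt):
    """Return True if some prefix of sigma (the root, an ancestor, or sigma
    itself under the assumed reading) already has a value in rt.

    sigma: tuple of process ids naming a node of the tree.
    rt:    dict mapping nodes (tuples) to values.
    """
    return any(sigma[:i] in rt for i in range(len(sigma) + 1))
```

For example, once the node (1, 2) has a value, every node below it, such as (1, 2, 3), satisfies the condition, while a sibling branch such as (1, 3) does not.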

Proof of Theorem 2. We prove the theorem by induction on k. We first prove the theorem assuming φ > 1 and will 
conclude by proving the theorem for the case φ = 1. 

As the proof is quite complex, we split it into three ranges, k = 1, k ≤ φ − 1, and k ≤ φ + 1. We will prove the 
following claims, where each handles the appropriate range: 
Claim 1. Theorem 2 holds for k = 1. 

The general case. This is where most of the technical challenge lies: 

Claim 2. Theorem 2 holds for 1 < k ≤ φ − 1. 

The final two rounds, when the resolve rules are slightly different: 

Claim 3. Theorem 2 holds for φ − 1 < k ≤ φ + 1. 

Proof of Claim 1. We will prove each of the four items separately. 

Proof of Item 1 for Claim 1. (Detection) By the end of round 1, a process may add another to F only through gossiping. 
Property 1 implies that no correct process will suspect any other correct process. The rest of the fault detection 
rules are not applicable in the first round. □ 

Proof of Item 2 for Claim 1. (Validity) For k = 1, Statement 2c and Statement 2b vacuously hold, since there was 
no such round. For proving Statement 2a observe that in the first round only EARLY_IT-TO-RT RULE is applicable. 
Assume that a correct process p ∈ G applies EARLY_IT-TO-RT RULE by the end of round 1, thus node σ is ε. 
This implies that for every x ∈ N \ F_p, IT_p(x) = d. Since G ∩ F_p = ∅, we conclude that for every correct 
process q, d_q = d, and by the end of the 2nd round, by Lemma 1, all correct processes will have RT(ε) = d = d_p and 
we are done. □ 

Proof of Item 3 for Claim 1. (Safety) Only Statement 3d is applicable for k = 1. Notice that the only case in which 
σx ∈ PT_p[1] is when p applies EARLY_IT-TO-RT RULE at the end of the 1st round and as a result puts some value d to 
the root node in its RT_p. In such a case, it is clear that if σ ∈ PT_q[1] then PT_p(σ) = PT_q(σ). □ 

Proof of Item 4 for Claim 1. (Liveness) This item vacuously holds. □ 

This completes the proof of Claim 1 . □ 

Now we move to proving the main part of the theorem. 

Proof of Claim 2. The proof is by induction. The base case is Claim 1. Assume correctness for any k'', 1 ≤ k'' < k, 
and we will prove the claim for k, k ≤ φ − 1. 

Proof of Item 1 for Claim 2. (Detection) The fault detection takes place in every round before any resolve rule is 
applied. By induction we know that a correct process will not add another correct process to F using gossiping 
from other processes. The three rules to add a process to F are based on the messages accumulated in IT. The 
induction on k − 1 allows us to determine what messages correct processes will be sending in round k. 

Let round k be the first round at which a process p is not sending messages related to the branch of σ. There 
are three cases in which a correct process, p, stops sending: by using DECAY RULE, EARLY_IT-TO-RT RULE or 
STRONG_IT-TO-RT RULE. If p closes the branch of σ at the end of round k − 1 and is not sending messages related to 
it in round k, the receive rule instructs correct processes what values to add to their IT. 

Let’s consider the three fault detection rules. Not-Voter is not applicable, since in the previous round p sent its 
messages appropriately. Since p is correct, every correct process that sends messages echoes the message p sent, and 
whenever a correct process applies the receive rule to assign messages to processes that did not send messages in 
the current round it adds the message p originally sent. For a similar reason Not-IT-to-RT is not applicable. 

The last fault detection rule is Not-Masking. Assume that a correct process q is expecting process p to mask away 
some process w. The Not-Masking rule allows q to mask the non-sending by ⊥, but q will not add p to F if by the 
end of the round q will have ∃σ' ⊑ σ''w such that σ' ∈ RT. Thus, p will not be in F during the processing of all the 
rules below. Statement 2c, which is proved next, guarantees that also by the end of the round a correct process p will not 
be added to F. □ 
Proof of Item 2 for Claim 2. (Validity) 

For Statement 2a, the case of k = φ + 1 is excluded for now. Assume that p sends (σ, p, d_p) in round k − 2. If any 
correct process x is not sending a message (σ, x, d_x), then by the protocol it should have set either σ' ∈ PT_x[k − 4] 
(if it used DECAY RULE) or σ' ∈ PT_x[k − 3] (if it used EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE) for some 
σ' ⊑ σ, and we are done by induction (Statement 3d). If there is a correct node x ∈ σ, then the claim holds by 
induction (Statement 2a). So we are left with the case that no correct node appears in σ and all correct processes are 
participating in round k − 2. By the end of round k − 2 every correct process x will apply the receive rule and will 
have IT_x(σp) = d_p. 

If any correct x (x ≠ p) doesn’t send a message (σp, x, d_p) then we are done by induction, using a similar argument 
as above. Therefore, by the end of round k − 1, every correct process will apply the receive rule and will have n − t − 1 
children nodes for p in its IT. Thus, by the end of round k − 1, for every x ∈ G, at every y ∈ G, x is a supporter of x 
for (σ, p, d_p). And for every x, y ∈ G, where x ≠ y ≠ p, IT_x(σpy) = d_p. In round k some correct processes (including 
p) may not send messages and all the rest will send identical value d_p messages. The above implies that the receive 
rule will assign to each correct process that does not send messages the identical value d at every correct process that 
still processes messages for this branch. 

As we argued before, since p itself is confirmed on each node it echoes, every correct process will be a voter and 
therefore, by the end of round k, at every correct process x ∈ G that still processes messages for this branch, either 
RT_x(σp) = d_p, or ∃σ' ⊑ σp such that σ' ∈ RT_x. 

The proof of Statement 2b is identical to the above, as if it is the case of a correct process sending _L. 

Proving Statement 2c: Let σ ∈ Σ_{k−1} and p ∈ G. If p does not send any message (σ, p, d) for any d ∈ D in round 
k, then either the branch was closed earlier and we are done by induction, or this is the first round any correct process 
doesn’t send a message on this branch. Thus, p applied DECAY RULE, EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT 
RULE by the end of round k − 1. 

We will cover each of the closing rules separately. 

Proving the claim in case p ∈ G uses DECAY RULE: by definition ∃σ' ⊑ σ such that σ' ∈ RT_p[k − 2], which 
results in closing the branch by the end of round k − 1 and not sending in round k. If ∃σ'' ⊑ σ, σ'' ∈ RT_p[k − 3], 
then we are done by induction. Otherwise, it must be because of messages received in round k − 2. All such messages 
are reflected in IT_p. For such a σ' to be in RT_p, it should be as a result of applying IT-TO-RT RULE, ROUND φ + 1 
RULE, EARLY_IT-TO-RT RULE, or STRONG_IT-TO-RT RULE. Since k − 2 ≠ φ + 1, we conclude that it is not a result 
of applying ROUND φ + 1 RULE. If it is a result of p applying EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE 
in round k − 2 then this branch would be closed already by the end of round k − 1 and we are done by induction. 
Similarly, if any other correct process closed the branch by the end of round k − 2, we are done by induction. 

Assume now the case that it is a result of p’s using IT-TO-RT RULE. Thus, there should be some σ' = σ̄w, such that 
σ' ⊑ σ, σ̄w ∈ Σ_{k−4} and p applied IT-TO-RT RULE in round k − 2 to put it in RT_p (and PT_p, for proof purposes). Let 
d be the value assigned by p to PT_p(σ̄w) as a result of processing IT_p by the end of round k − 2. If there is a correct 
node in σ̄w, we are done by induction. Since this is not the case, then when p applied IT-TO-RT RULE it observed a set 
U of n − t processes in IT_p that are voters of (σ̄, w, d), of which at least t + 1 are correct processes. Let Ū be the set 
of correct voters in U. 

For each voter v ∈ Ū there is a set W_v of n − t processes that are confirmed on (σ̄, w, d), where v is a supporter 
to each u ∈ W_v on (σ̄, w, d). Since we assume that there are no correct nodes in σ̄w, v ≠ w. 

By definition, for each u ∈ W_v \ {w, v}, IT_p(σ̄wuv) = d, and since v ∈ G and no correct process closed the 
branch or stopped sending yet, then by the end of round k − 2, for every x ∈ G \ {u, v}, IT_x(σ̄wuv) = d. If v ∈ W_v, 
then all will also have IT_x(σ̄wv) = d. 

For u ∈ G, since σ̄w ∈ Σ_{k−4}, by induction, RT_x(σ̄wuv) = d, or ∃σ' ⊑ σ̄wuv such that σ' ∈ RT_x, at every 
x ∈ G. 

For u ∉ G, by the end of round k − 1, for every x ∈ G, at every y ∈ G, x is a supporter of x for (σ̄wu, v, d). And 
for every x, y ∈ G, where x ≠ y ≠ v, IT_x(σ̄wuvy) = d. In round k some correct processes (including p) may not send 
messages and all the rest will send identical value d messages. The above implies that the receive rule will assign 
to each correct process that does not send messages the identical value d at every correct process that still processes 
messages for this branch. 

As we argued before, since v itself is confirmed on each node it echoes, every correct node will be a voter and 
therefore, by the end of round k, at every correct process x ∈ G that still processes messages for this branch, and for 
every u ∈ W_v, either RT_x(σ̄wuv) = d, or ∃σ' ⊑ σ̄wuv such that σ' ∈ RT_x. 

Now observe that each u ∈ W_v, being confirmed on (σ̄, w, d), has a set U_u of n − t supporters in IT_p of u 
for (σ̄, w, d) (one of which is u itself). Let Ū_u be the set of correct processes in U_u. By definition, for each u ∈ W_v, 
IT_p(σ̄wu) = d and for each u' ∈ U_u \ {u}, IT_p(σ̄wuu') = d. Since no correct process closed the branch or stopped 
sending, at every x ∈ G, IT_x(σ̄wuu') = d, and if u ∈ G, then IT_x(σ̄wu) = d. Thus, by the end of round k, at every 
correct process x ∈ G \ {u, u'} that still processes messages for this branch, PT_x(σ̄wuu') = d, where u ∈ W_v and 
u' ∈ U_u \ {u}. Thus, each u ∈ W_v is RT-confirmed on (σ̄, w, d) and each v ∈ Ū is an RT-voter on (σ̄, w, d). The same 
holds, by definition, for u and u' if they did not close the branch earlier. This implies that such x will apply RESOLVE 
RULE to assign PT_x(σ̄w) = d (or would observe by that time ∃σ' ⊑ σ̄w such that σ' ∈ RT_x), which completes the 
proof for this case. 

Proving the claim in case p ∈ G uses EARLY_IT-TO-RT RULE: Assume that a correct process p ∈ G applies 
EARLY_IT-TO-RT RULE by the end of round k − 1. Let σ ∈ Σ_{k−2} and denote σ = τu. The assumption of p’s 
closing the branch implies, among other things, that for every x, y ∈ N \ F_p such that τux, τuy ∈ IT_p, IT_p(τux) = 
IT_p(τuy) = d, for some d ∈ D, and thus PT_p(τu) = d. This also implies that every correct process x that applies 
the receive rule in round k will assign IT_x(σp) = IT_x(σ) = d. If there is any correct process in τ, we are done by 
induction (Statement 2a on k − 1, since the correct processes sent in k − 3 or earlier). If this is not the case, whether u 
is correct or not, we conclude that by the end of round k − 1 every correct process x ∈ G will have IT_x(τuy) = d for 
every y ∈ G \ {u, x}, and if u ∈ G then also IT_x(τu) = d. This is true since by Lemma 2 and Item 1, G ∩ F_x = ∅. 
Thus, by the end of round k every correct process x that did not close the branch will use IT-TO-RT RULE to obtain 
RT_x(τu) = d (or would observe by that time ∃σ' ⊑ τu such that σ' ∈ RT_x), and we are done. 

Proving the claim in case p ∈ G uses STRONG_IT-TO-RT RULE: Assume that p applies STRONG_IT-TO-RT RULE by 
the end of round k − 1. If there is a correct process in σ, we are done by induction. If ∃σ' ⊑ σ such that σ' ∈ RT_q[k − 1] 
for any correct q, we are also done. Otherwise, let σ ∈ Σ_{k−3}. By definition there exists U, U ∩ σ = ∅, |U| = n − r + 2 
such that for every u, v ∈ U \ F, where v ≠ u, IT_p(σuv) = IT_p(σvu). Let x be the node such that σx ∈ Σ, but 
x ∉ U ∪ F. Since we assume that there is no correct process in σ, G ⊆ U ∪ {x}. Assume first that x is not correct. 
If this is the case, then the assumption on U implies that all members of U are supporters and voters and by the end 
of round k − 1, σ would be in the RT of every correct process. If this is not the case, we are left with the option that x is 
correct but doesn’t agree with some of the values all members of U sent. Denote by Ū the correct members of U; it 
is clear that |Ū| = n − t − 1 and |U| ≥ n − t. The definition of the set U implies that by the end of round k − 1, either 
p puts σ in PT_p, or ∃σ' ⊑ σ such that σ' ∈ RT_p. Moreover, by induction, for every member u of Ū, σu ∈ RT_q of 
every correct process q by the end of round k. Thus, by the end of round k every correct process q that doesn’t already 
have σ ∈ RT_q will be able to apply RELAXED RULE to put σ ∈ RT_q, and we are done. □ 

Proof of Item 3 for Claim 2. (Safety) Notice that when a process p puts a value to a node σx, say in round k, then at 
that point in time ∄σ' ⊑ σ such that σ' ∈ RT_p[k]. 

Observe that if both p and q put values to σx prior to round k, then the claims hold by induction on k. Therefore 
we limit ourselves to nodes whose value q puts in its PT in round k and p had put a value to that node in its PT in 
some round k' ≤ k. Moreover, we limit ourselves to the case where no correct process had put a value to that node in 
its PT in any round k'' < k'. 

We prove Item 3 by backward induction on the length ℓ = |σx| from ℓ = k to 1. For each ℓ we will go through 
all the put rules p could have applied in setting the value to σx in round k or earlier, and for each rule we consider the 
relevant rules q could have applied, and we will prove that the four statements hold in each case. 

The rules to put a value to a node in RT (and PT) are: 1) IT-TO-RT RULE, 2) RESOLVE RULE, 3) RELAXED RULE, 
4) SPECIAL-BOT RULE, 5) SPECIAL-ROOT-BOT RULE, 6) EARLY_IT-TO-RT RULE, 7) STRONG_IT-TO-RT RULE and 8) 
ROUND φ + 1 RULE. 

The case ℓ = k: A node of level k, where k < φ, cannot be put in PT by the end of round k. 

The case 1 ≤ ℓ < k: let |σx| = ℓ and assume correctness for every ℓ' > ℓ. Since k < φ, ROUND φ + 1 RULE is 
not applicable. 

If there is a correct predecessor in σ, we are done by Item 2, since by the end of round ℓ + 1 process q will have 
σx ∈ RT_q (due to coloring), hence no node σxu will be in PT_q and all four statements clearly hold. 

Otherwise, if process x is correct, by Item 2, by the end of round ℓ + 2 process q will have σx ∈ RT_q. A node σxu can 
be in PT_q only if q applied EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE in that round, so any such node will 
also be set to the value of σx, which is the same at both p and q, thus all four statements hold. 

Otherwise, there is no correct process in σx. Thus, node x has n − ℓ children nodes, out of which at least n − t 
are correct and out of the t − ℓ others, at most φ − ℓ are actively faulty and at least t − φ are silent. 
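
The counting in this case can be checked with a short sketch. It assumes that a node at depth ℓ whose path contains no correct process has used up ℓ of the t corrupt processes, so that at most t − ℓ of its n − ℓ children are corrupt; the function name child_counts is hypothetical.

```python
def child_counts(n, t, ell):
    """Counts for a node at depth ell whose path contains only faulty processes.

    Returns (children, min_correct_children, max_faulty_children).
    """
    if not (0 <= ell <= t < n):
        raise ValueError("expected 0 <= ell <= t < n")
    children = n - ell                 # processes not yet on the path
    max_faulty_children = t - ell      # faulty processes not yet on the path
    min_correct_children = children - max_faulty_children  # equals n - t
    return children, min_correct_children, max_faulty_children
```

Note that the number of correct children is n − t regardless of the depth, which is why the sets of size n − t used by the resolve rules remain available at every level.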

We start by proving the first three statements and after that we will prove the fourth statement. 

>- Consider the case that p used RESOLVE RULE to put σx: RESOLVE RULE implies that there are t + 1 RT-voters. 
Each RT-voter has a set of n − t children nodes RT-confirmed for (σ, x, d'). Define, in such a case, by V_p the set of 
children nodes of σx that are RT-confirmed to d' in PT_p. By definition, each confirmed node in V_p has t + 1 children 
nodes in RT_p with the same value d'. 

Proof of Statement 3a of Item 3 for Claim 2. By definition, confirmed is defined for ℓ + 2 ≤ k ≤ φ + 1. For node v, 
being RT-confirmed implies that there is a set V_v, such that for each v' ∈ V_v, RT_p(σxvv') = d, where |V_v| = t + 1. If 
σxv ∈ RT_p when p puts σx, then also σxv ∈ PT_p, otherwise σx should be in RT_p already. Moreover, if σxv ∈ RT_p, 
it should be that RT_p(σxv) = d, otherwise, by coloring, RT_p(σxvv') would also not be equal to d. By induction, level 
ℓ + 1, Statement 3d, we conclude that q can’t put σxv to ⊥. Therefore, it should be the case that when p puts σx to 
PT_p, σxv ∉ PT_p. In such a case, all children nodes of σxv that are in RT_p are in PT_p. Specifically, every v' ∈ V_v is 
in PT_p. This also implies that v ∉ G. By induction, on level ℓ + 2, Statement 3d, we conclude that for every v' ∈ V_v, 
if σxvv' ∈ RT_q then PT_p(σxvv') = PT_q(σxvv') = d. 

Node v has exactly n − ℓ − 1 children nodes. When q puts a value to node σxv, all children nodes of node σxv 
are not colored. There are at most n − ℓ − 1 − (t + 1) < n − t children nodes of σxv in PT_q that are not in V_v. 

Look at the rules q may use in order to put σxv to ⊥. 

- Consider the case that q used IT-TO-RT RULE to put σxv to ⊥: If q applies IT-TO-RT RULE, then it should 
have for each voter e a set U_e of n − t processes confirmed on (σx, v, ⊥) in IT_q. There is at least one process in 
the intersection of U_e and V_v. Denote it by u, u ∈ U_e ∩ V_v. Observe that the definition of V_v implies that u ≠ v. 
Being confirmed implies that u has a set of at least t + 1 correct processes U_u such that IT_q(σxvuu') = ⊥ for 
every u' ∈ U_u. For any such u' that sends messages in this round, IT_p(σxvuu') = ⊥. If there is u' that closed 
the branch using DECAY RULE, then it did so before round k − 2, and we are done by induction. Otherwise, if it 
used EARLY_IT-TO-RT RULE, both q and p would assign it the same value, and therefore we also conclude that 
IT_p(σxvuu') = ⊥. If it used STRONG_IT-TO-RT RULE, then there is a set U' of at least t + 1 correct processes such 
that IT_p(σxvuu') = IT_q(σxvuu') = ⊥, since all but one send the same value, and q saw n − t of them. 

We now argue that if p has such a set of children nodes, it implies that if σxvu ∈ PT_p, then PT_p(σxvu) = ⊥. 

Consider the various put rules p can use to put a value to PT_p(σxvu). If p uses EARLY_IT-TO-RT RULE in 
round ℓ + 3 it should be to the value IT_p(σxvu) = ⊥. If p applies IT-TO-RT RULE in round ℓ + 4 it should be the case 
that IT_p(σxvu) = ⊥. By the end of round ℓ + 5, all the correct children nodes in U_u (or U'), by Item 2, will be in 
PT_p with value ⊥ and will color their subtrees in RT_p to ⊥. Therefore, if p applies any rule to put the value of σxvu, 
it will be to ⊥. This contradicts the fact that u ∈ V_v. 

- Consider the case that q used RESOLVE RULE to put σxv to ⊥: If q applies RESOLVE RULE, then it should 
have a set U_e of nodes RT-confirmed on (σx, v, ⊥) in PT_q. Each u in U_e has a set W_u of size t + 1 such that for each 
u' ∈ W_u, PT_q(σxvuu') = ⊥. Since at least one of the nodes in U_e is in V_v, there is a contradiction to the induction 
on Statement 3b. 

- Consider the case that q used RELAXED RULE to put σxv to ⊥: this contradicts Statement 3d. 

Thus, we are left with the option of q applying SPECIAL-BOT RULE to put σxv to ⊥, proving the statement. □ 

Proof of Statement 3b of Item 3 for Claim 2. In this case, potentially some nodes from V_p (though at most one) may 
resolve to ⊥. Observe that node σx in RT_q has at most n − ℓ − (n − t) = t − ℓ children nodes outside V_p. Since ℓ ≥ 1, 
for the claim not to hold there should be at least 2 nodes from V_p that resolve to ⊥. Statement 3a and the definition 
of SPECIAL-BOT RULE imply that at most one node can be resolved using SPECIAL-BOT RULE. We are done since 
t − ℓ + 1 ≤ t. □ 

Proof of Statement 3c of Item 3 for Claim 2. The observation above implies that every node in V_p that is put in RT_q 
should be with value ⊥. Thus, proving this case. □ 

>- Consider the case that p used IT-TO-RT RULE to put σx: 

Statement 3a is not applicable in this case, and the rest of the cases we discuss next. 

Proof of Statement 3b of Item 3 for Claim 2. The IT-TO-RT RULE implies that in IT_p there is a set V of n − t voters 
of (σ, x, d). Each v ∈ V has a set W_v of n − t processes that are confirmed on (σ, x, d), where v is a 
supporter to each u ∈ W_v on (σ, x, d). Each such u, being confirmed on (σ, x, d), has a set U_u of n − t supporters in 
IT_p to u on (σ, x, d). Each of these sets of size n − t contains at least t + 1 correct nodes. Let U_v be the set of children 
nodes, where each one has at most t correct supporters to (σ, x, d) in IT_p. The above implies that |U_v| ≤ t − ℓ (notice 
that U_v ⊆ N \ W_v). 
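
The quorum arithmetic used here, that any set of n − t processes contains at least t + 1 correct ones when n ≥ 3t + 1, can be verified with a small sketch. The function names are hypothetical; the second function adds the standard overlap bound for two sets of size n − t, which is an illustration and not a step quoted from the proof.

```python
def min_correct_in_quorum(n, t):
    """Fewest correct processes in any set of n - t processes,
    given that at most t processes are faulty."""
    return (n - t) - t


def min_quorum_overlap(n, t):
    """Fewest processes shared by any two sets of n - t processes out of n."""
    return max(0, 2 * (n - t) - n)
```

At the resilience boundary n = 3t + 1 both quantities equal t + 1, while at n = 3t a set of n − t processes may contain only t correct ones, which is one way to see why the bound n ≥ 3t + 1 is needed for this step.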

Assume by contradiction that q has |V_q| > t children nodes of σx in PT_q such that PT_q(σxu) = ⊥ for each 
u ∈ V_q. Therefore, there must exist two nodes, y_1, y_2 ∈ V_q, that are not from the set U_v (because ℓ ≥ 1 so |U_v| ≤ 
t − ℓ ≤ t − 1). 

We now go through the put rules q can apply to put values to the children nodes of σx. We will also study the 
minimal round at which q can apply these put rules. Since RT_q(σx) can’t have a value when the put rule is applied 
by q to the children nodes of σx, the earliest round at which q can use any other rule to put its value, if at all, is the 
end of round ℓ + 2. 

For round ℓ + 2: If ℓ + 2 ≤ k, then by the end of round ℓ + 2, it can’t be that all echoing processes to y_1 or y_2 
have sent the value ⊥. Thus, by the end of round ℓ + 2, q cannot have either y_1 or y_2 in RT_q with value ⊥, and the 
claim holds. 

For round ℓ + 3: If ℓ + 3 ≤ k, then by the end of round ℓ + 3, each y ∈ {y_1, y_2} has t + 1 correct children that 
are supporters for d in IT_q and therefore PT_q(σxy) can’t be ⊥. Thus, by the end of round ℓ + 3, q can 
have at most t − ℓ children nodes of σx in RT_q with value ⊥, and the claim holds. 

If ℓ + 4 ≤ k, then by the end of round ℓ + 4, by Item 2, the values of all correct nodes in all sets V, W_v and U_u above 
are already in RT_q. This implies that y_1 and y_2 each has at least t + 1 children nodes in RT_q with value d. The value of 
neither y_1 nor y_2 can be put to ⊥ using RESOLVE RULE or RELAXED RULE, and clearly not SPECIAL-ROOT-BOT 
RULE. We already excluded EARLY_IT-TO-RT RULE, STRONG_IT-TO-RT RULE, and IT-TO-RT RULE, so the only rule 
that may be applied is SPECIAL-BOT RULE. But SPECIAL-BOT RULE can be applied only when all other sibling nodes 
are already in RT_q, so it can be applied to either y_1 or y_2 but not to both. A contradiction. This completes the proof 
of Statement 3b for this case, assuming p used IT-TO-RT RULE to put the value of RT_p(σx). □ 

Proof of Statement 3c of Item 3 for Claim 2. The proof is identical to the proof of Statement 3b, with one small change: 
SPECIAL-BOT RULE does not produce a value that is different from ⊥. □ 

>- Consider the case that p used EARLY_IT-TO-RT RULE in order to put σx. 

Proof of Statement 3b of Item 3 for Claim 2. The EARLY_IT-TO-RT RULE implies that in IT_p there is a set U, U ∩ {u' | 
u' ∈ σx} = ∅, |U| = n − ℓ, such that for every u, v ∈ U \ F, IT(σxu) = IT(σxv). Assume first that no correct 
process closes the branch by the end of round ℓ + 1. This implies that q will also see all correct processes sending the 
same value. Therefore, it can’t apply any rule on its IT_q to put any child of σx in PT_q with a value of ⊥. By the end of 
round ℓ + 3, by Item 2, the values of all correct nodes in U are already in RT_q. This implies that q can’t have a set V_q 
of size more than t for any different value. 

Now, if there is u ∈ U that closed the branch and did not send in round ℓ + 1, then by the end of round ℓ + 1, 
by Statement 2c, at every correct process q, ∃σ' ⊑ σx such that σ' ∈ RT_q, which implies that by the end of round ℓ + 1, 
σx ∈ RT_q, contradicting our assumption. □ 

Proof of Statement 3c of Item 3 for Claim 2. The proof is identical to the proof of Statement 3b with a small change,
except that SPECIAL-BOT RULE does not produce a value that is different from ⊥. □

>- Consider the case that p used STRONG_IT-TO-RT RULE in order to put σx.

Proof of Statement 3b of Item 3 for Claim 2. Assume for contradiction that ∃V_q such that |V_q| = t + 1, where V_q =
{u | PT_q(σxu) = ⊥}. The STRONG_IT-TO-RT RULE implies that in IT_p, by the end of round ℓ + 2, there is a set U,
U ∩ {u′ | u′ ∈ σ} = ∅, |U| = n − ℓ + 1, such that for every u, v ∈ U \ F, where v ≠ u, IT_p(σxuv) = IT_p(σxvu) = d.

Assume first that no correct process closes the branch by the end of round ℓ + 2. This implies that IT_q will include
all values above appearing in IT_p for correct processes. We assume that there is no correct process in σx, and that
|σx| > 1. Therefore |F_q \ {u′ | u′ ∈ σx}| < t. Moreover, also |(F_p ∪ F_q) \ {u′ | u′ ∈ σx}| < t. Therefore, there
should be at least two processes in V_q that are not in F_q. Therefore, there should be y ∈ V_q such that y ∈ U \ (F_p ∪ F_q).
By the definition of U, there is a set U_q of n − t − 1 correct processes such that IT_q(σxyu) = d, for u ∈ U_q. Therefore,
by the end of round ℓ + 3, q can’t have PT_q(σxy) = ⊥.

Since p applied STRONG_IT-TO-RT RULE by the end of round ℓ + 2, by the end of ℓ + 3, by Statement 2c, σx ∈ RT_q,
so PT_q(σxy) will never be set to ⊥. A contradiction. □

Proof of Statement 3c of Item 3 for Claim 2. The proof is identical to the proof of Statement 3b with a small change,
except that SPECIAL-BOT RULE does not produce a value that is different from ⊥. □

>- Consider the case that p used RELAXED RULE to put σx: In this case when p applies the rule, all its children
nodes are in PT_p. By induction (Statement 3d), none of these children nodes will appear with a conflicting value in
PT_q. Since n − t − 1 of them are with the same value d′, at most n − ℓ − (n − t − 1) = t − ℓ + 1 are with a different
value. RELAXED RULE is applied only when ℓ ≥ 1. This immediately implies that Statement 3b and Statement 3c
hold.
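
The counting step in the RELAXED RULE case can be checked mechanically. The following is a small Python sketch, not part of the protocol: the function name and the sample parameters are illustrative assumptions, and the code only restates the arithmetic n − ℓ − (n − t − 1) = t − ℓ + 1 used above.

```python
def max_conflicting_children(n: int, t: int, level: int) -> int:
    """Bound on the children of a level-`level` node that may hold a different value.

    Assumption (taken from the argument above): the node has n - level
    children, and n - t - 1 of them carry the same value, so only the
    remaining children can disagree.
    """
    children = n - level      # processes not yet on the path to the node
    agreeing = n - t - 1      # children that share the common value
    return children - agreeing


# The bound simplifies to t - level + 1 for any choice of parameters.
for n, t, level in [(4, 1, 1), (7, 2, 1), (7, 2, 2), (10, 3, 2)]:
    assert max_conflicting_children(n, t, level) == t - level + 1
```

For every level ℓ ≥ 1 the bound is at most t, which is the property the two statements rely on.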

We completed the proof of the first 3 statements. We now prove the last one. 

Proof of Statement 3d of Item 3 for Claim 2. If |σx| ≥ 1, then Statement 3b and Statement 3c clearly prove that Statement
3d holds, unless p uses SPECIAL-BOT RULE to put σx. The proof above covers the case that q uses any rule other
than SPECIAL-BOT RULE, by symmetry between p and q in this statement. Thus, we are left with the case that both
are using SPECIAL-BOT RULE, and clearly both put ⊥.

We are left to consider the case |σx| = 0, thus σx = ε. For that we need to consider all put rules that p and q may
have applied. There are 3 applicable rules, IT-TO-RT RULE, RESOLVE RULE, and SPECIAL-ROOT-BOT RULE, to put
a value to ε. Notice that SPECIAL-BOT RULE and RELAXED RULE are not applicable, and EARLY_IT-TO-RT RULE and
STRONG_IT-TO-RT RULE were covered in Statement 2c.

Node ε has n children nodes, out of which at least n − t are correct and out of the t others, at most φ are actively
faulty and at least t − φ are silent. Notice that for ε, every child node that is in RT_p is also in PT_p, since once we
assign a value to ε we do not process any other node.

>- Consider the case that p used EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE to put a value to ε: Both rules
imply that p sees a unanimous echoing by all n processes, with the exception of at most one process. Since we assume
that all correct processes participate, there is no way that q will put a different value to ε.

>- Consider the case that p used IT-TO-RT RULE to put ε and that RT_p(ε) = ⊥: The basic arguments are the same
as in the case ℓ ≥ 1, but the set of put rules that q may apply differs. If q also uses IT-TO-RT RULE, then the claim
clearly holds. If q uses SPECIAL-ROOT-BOT RULE, then it obtains the same value. So we are left with the case of q
using RESOLVE RULE. The arguments are the same as in the case ℓ ≥ 1, which exclude the possibility that q puts any
value other than ⊥ to ε, completing the proof of this case.

>- Consider the case that p used IT-TO-RT RULE to put ε and that RT_p(ε) = d, d ≠ ⊥: We now need to consider
the possibility of q using IT-TO-RT RULE, RESOLVE RULE and SPECIAL-ROOT-BOT RULE. The arguments for the first
two are the same as above and are left out.

For using SPECIAL-ROOT-BOT RULE process q should have a set V_q, |V_q| = t + 1, such that for each v ∈ V_q,
RT_q(v) = ⊥. Notice that also here there is no difference between RT_q(v) and PT_q(v).

The IT-TO-RT RULE implies that in IT_p there is a set V_p of n − t voters of (ε, ε, d). Each v ∈ V_p has a set W_v
of n − t children nodes such that v is a supporter to each u ∈ W_v on (ε, ε, d), and each such u has a set U_u of n − t
supporters in IT_p to (ε, ε, d). Each of these sets of size n − t contains at least t + 1 correct nodes. Thus, at most t
nodes do not have at least t + 1 correct supporters to (ε, ε, d).
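
The claim that every set of size n − t contains at least t + 1 correct nodes follows from n ≥ 3t + 1 and the bound of t faulty processes. The Python sketch below only illustrates that arithmetic; the helper name is a hypothetical choice, and the members of the set are assumed to be distinct processes.

```python
def min_correct_in_set(set_size: int, t: int) -> int:
    """Lower bound on the correct members of a set of distinct processes.

    Assumption: at most t processes are faulty overall, so at most t
    members of any set can be faulty.
    """
    return max(set_size - t, 0)


# With n = 3t + 1, a set of size n - t = 2t + 1 has at least t + 1 correct members.
for t in range(1, 6):
    n = 3 * t + 1
    assert min_correct_in_set(n - t, t) == t + 1
```

Larger n only increases the bound, so the statement holds for every n ≥ 3t + 1.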

Thus, there should be a process x ∈ V_q that has a set U_x of n − t supporters in IT_p to (ε, ε, d). At least t + 1 of
the processes in this set are correct, and these are also supporters in IT_q to (ε, ε, d). Consider the various rules q can
apply to put a value ⊥ to RT_q(x). By the end of the 2nd round it can’t apply EARLY_IT-TO-RT RULE to set a ⊥ to it,
because the correct processes in U_x send a different value. For that reason it can’t apply IT-TO-RT RULE in the end of
round 3 to put value ⊥ to RT_q(x). By the end of round 4, for every correct process y ∈ U_x, PT_q(xy) = d. Process q can’t
apply SPECIAL-BOT RULE to put a value ⊥ to x, since that rule is not applicable for |x| = 1. RESOLVE RULE, RELAXED
RULE and STRONG_IT-TO-RT RULE can’t be used to put ⊥. Since we assume that k < φ, the case φ = 1 is not relevant.
Therefore ROUND φ+1 RULE can’t be applied either, and we are done.

>- Consider the case that p used RESOLVE RULE to put ε: If q uses IT-TO-RT RULE, by symmetry we are done. If
q also uses RESOLVE RULE, by definition both obtain the same value. We are left with the case that q uses
SPECIAL-ROOT-BOT RULE. The interesting case is that RT_p(ε) = d, d ≠ ⊥. Observe that we cannot use the induction on k = 1
since the sets of applicable rules differ. The arguments for proving the case are similar to the previous case, the case of
IT-TO-RT RULE, since q can’t apply SPECIAL-BOT RULE to any node in level 1.

>- Consider the case that p used SPECIAL-ROOT-BOT RULE to put ε: If q also uses it the claim holds. Otherwise it
falls into the other rules discussed above.

This completes the proof of Statement 3d . □ 

This completes the proof of Item 3 (Safety) for Claim 2. □ 

Proof of Item 4 for Claim 2. (Liveness) It is enough to prove that if p ∈ G puts σx ∈ PT_p in some round r < k, then
by the end of round max(r + 2, φ + 1) σx ∈ RT_q, for every q ∈ G.

We prove the lemma by backward induction on ℓ = |σx|, from ℓ = k to ℓ = 1. As in the proof of Item 3, the claim
clearly holds for ℓ = k, since no node of level k, k < φ + 1, can be added to PT by the end of round k. The case
ℓ = k − 1 is applicable only to EARLY_IT-TO-RT RULE, and is covered by the proof of Statement 2c.

Assume the induction for any k > ℓ′ > ℓ and we will prove for ℓ, ℓ < k − 1. If σx contains a correct node then by
induction on Item 2 we are done. So assume that there is no correct process in σx. Let p be the first to put σx, where
σx ∈ PT_p, and let r be the round at which it did that. Consider the various possible put rules.

>- Case p applied EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE: Statement 2c implies the proof.

>- Case p applied IT-TO-RT RULE: By definition, this can happen only in round r = ℓ + 2. The IT-TO-RT RULE
implies that there are t + 1 correct voters of (σ, x, d) in IT_p, each having n − t nodes, each of which is confirmed on
(σ, x, d) in IT_p. Let U_x be the set of the confirmed nodes on (σ, x, d) in IT_p and V_x the set of correct voters. Observe
that U_x contains at least t + 1 correct processes.

If by round r + 1 σx ∈ RT_q we are done. If not, then if for any u ∈ U_x, σxu ∈ RT_q, it should be in PT_q, and it
should be with a value d, because of using either IT-TO-RT RULE, EARLY_IT-TO-RT RULE, or STRONG_IT-TO-RT RULE
by q, and it can’t obtain a different value, because of the correct processes in U_x and V_x.

If by round r + 2 σx ∈ RT_q we are done. If not, Item 2 implies that by max(r + 2, k) all voters in V_x will appear
in RT_q as RT-voters on (σ, x, d), since the nodes in U_x will be confirmed to (σ, x, d). These arguments and Item 3
imply that if any of them is colored, it should be colored to d. Therefore, q can apply RESOLVE RULE to add σx to
PT_q and we are done.

>- Case p applied RESOLVE RULE: By definition, assuming that no branch closing took place, this can happen
only in some round r > ℓ + 4. Assume first that ℓ > 1; we later deal with smaller values of ℓ. If by the end of round
r + 2 process q puts a value to σx or to a predecessor of σx, we are done. Otherwise, by the induction hypothesis,
by r + 2, each node involved in applying RESOLVE RULE by p to σx in PT_p is either colored or its value put by q in
RT_q. We will show that by r + 2 process q can apply one of the rules to put a value to σx in PT_q.

Let V_p be the set of children nodes of σx that are RT-confirmed to d′ in RT_p. By Statement 3d, for every v ∈ V_p,
if σxv ∈ PT_p and σxv ∈ PT_q, then PT_p(σxv) = PT_q(σxv).

None of the nodes in V_p can be confirmed to a different value than d′ in RT_q, unless it was put by q to a different
value. If d′ ≠ ⊥ this can happen to the value of ⊥ and by Statement 3a, this can happen only using SPECIAL-BOT
RULE. Thus, there can be at most one such node z ∈ V_p that was set to ⊥ by q.

If none was set using SPECIAL-BOT RULE, then by r + 2 process q should see the same set of voters that p did
and is able to apply RESOLVE RULE. Otherwise, it should have applied the SPECIAL-BOT RULE to one of the nodes in
V_p. Before it can apply SPECIAL-BOT RULE, all other children nodes of σx should be put to a value. After applying
SPECIAL-BOT RULE to node σxz all the children nodes of σx have a value in RT_q. For every y ∈ V_p the value is d′,
so q, by that time, would have at least n − t − 1 children nodes set to d′. Thus, q can apply RELAXED RULE to put a
value to σx and the claim holds.

In the case of ℓ = 1, by definition q can’t apply SPECIAL-BOT RULE to set a value to z, and therefore it should
have been able to use RESOLVE RULE to set a value to σx. The case of ℓ = 0 is similar to the case of ℓ = 1.

>- Case p applied RELAXED RULE: Since p applies this rule, all the children nodes of σx are put to a value in PT_p
and by induction by r + 2 also at q. If σx ∈ RT_q, we are done. Otherwise, by Statement 3d their value is the same as
for p and process q can also apply RELAXED RULE.

>- Case p applied SPECIAL-BOT RULE or SPECIAL-ROOT-BOT RULE: exactly as in the previous case.

This completes the proof of Item 4 (Liveness) for Claim 2. □ 

This completes the proof of Claim 2. □ 

We can now complete the proof of the Theorem by covering the case of k ∈ {φ, φ + 1}.

Proof of Claim 3. We cover both cases for each item. 

Proof of Item 1 for Claim 3. (Detection) There are no special issues that surface in the last two rounds regarding
detection, and the proof for the case k < φ holds. □

Proof of Item 2 for Claim 3. (Liveness) 

>- Consider the case k = φ: There is no difference between the arguments for this case and those of k < φ.

>- Consider the case k = φ + 1: If |σx| = φ and if any correct process is not sending in this round it is because
of applying the RT[r − 3] limitation, and by induction we are done. Otherwise, if z sends, then ROUND φ+1 RULE
completes the proof. The case |σx| < φ is identical to that of k < φ.

The proof of Statement 2b is similar to the case in which a correct process sends ⊥ in the first round (Statement 2a). □


Proof of Item 3 for Claim 3. (Safety) Item 3 is not applicable in case k = φ + 1.

Consider the case k = φ. The case |σx| = φ: A value to a node at this level can’t be put at any round < φ. Observe
that two correct processes may put conflicting values in their RT to a node σxy at level φ + 1 that is associated with a
faulty process, since they may have conflicting values in their IT for that node. This may happen only if there wasn’t
any correct predecessor of x in σ, since Item 2 implies that before assigning y a value it would already be colored. By
Property 1, there is no conflict on all the t − φ faulty nodes that are initially in FA. Thus, there can be at most one
faulty node in level φ + 1. Item 2 also implies that during round φ + 1 node σx will be assigned a value by all correct
processes, and therefore so will node σxy.

Statement 3a is not applicable in the case of |σx| = φ.

Proof of Statement 3c for Claim 3. Node σx was put to ⊥ by process p. By the assumption of Statement 3c,
SPECIAL-BOT RULE wasn’t applied. ROUND φ+1 RULE is not applicable, since we are in level φ. IT-TO-RT RULE and RESOLVE
RULE are not relevant, since there is only a single level of nodes in IT or RT. SPECIAL-ROOT-BOT RULE is relevant
only for the case of σx = ε, which can’t happen for |σx| = φ. If process p uses EARLY_IT-TO-RT RULE or
STRONG_IT-TO-RT RULE, then similar arguments to those used in the proof of Statement 2c can be used.

We are left with RELAXED RULE. Node σx has n − t − 1 children nodes in RT_p all having the value ⊥ and all
but one are clearly correct nodes. Since there are exactly n − φ − 1 nodes in level φ + 1, there can be at most
n − φ − 1 − (n − t − 2) = t − φ + 1 nodes holding a non-⊥ value. Thus, process q can’t have t + 1 or more children
nodes with a value not ⊥ when it applies its put operation, completing the arguments for Statement 3c. □

Proof of Statement 3b for Claim 3. Node σx was put to d, d ≠ ⊥, by process p. Thus, SPECIAL-BOT RULE is not
applicable and, as in Statement 3c, we are left with RELAXED RULE. Node σx has n − t − 1 children nodes in RT_p
all having the value d and all but one are clearly correct nodes. Since there are exactly n − φ − 1 nodes in level φ + 1,
there can be at most n − φ − 1 − (n − t − 2) = t − φ + 1 nodes holding a non-d value. Thus, process q can’t
have t + 1 or more children nodes with a value not d when it applies its put operation, completing the arguments for
Statement 3b. □

Proof of Statement 3d for Claim 3. Case d = ⊥: if σx ∈ PT_q, then by Statement 3c, the only applicable rules for q
are RELAXED RULE or SPECIAL-BOT RULE. Both will result in validating the claim. Case d ≠ ⊥: if σx ∈ PT_q, then
by Statement 3b it is clear that the only possible rule to be applied is RELAXED RULE, which results in validating the
claim for Statement 3d. □

For nodes with |σx| ≤ φ − 1, identical arguments to those used in the proof of Claim 2 complete the proof of Item 3
(Safety) for Claim 3. □

Proof of Item 4 for Claim 3. (Liveness) The arguments for this item are the same for k = φ and k = φ + 1. As
we mentioned before, it is enough to prove that if p ∈ G puts σx ∈ PT_p in some round r, then by the end of
max(r + 2, φ + 1) σx ∈ RT_q, for every q ∈ G. The proof is by backward induction on ℓ = |σx|.

Case ℓ = φ + 1. The only round at which a process can put a value to a node in level φ + 1 in its RT is during
round φ + 1. At that round, every correct process that doesn’t have σx in its RT as a colored node will insert it to its
RT using ROUND φ+1 RULE.

Case ℓ = φ. Either p used EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE, or it set the value in round φ + 1.
If it used EARLY_IT-TO-RT RULE or STRONG_IT-TO-RT RULE, then Statement 2c completes the proof. Now we need
to consider the various potential put rules p applied in round φ + 1 in order to put the value for σx. IT-TO-RT RULE
and RESOLVE RULE are not applicable in this case.

>- Case p applied RELAXED RULE: If there exists a correct process in σ then by Item 2 we are done. If x ∉ G then all
children nodes of σx are either correct or silent, and if p applies the rule, every correct process can apply the same
rule. If x ∈ G, then there are n − t − 1 correct children nodes of σx, all of which will send the same value, and all
will apply the RELAXED RULE, completing the proof of this case.

>- Case p applied SPECIAL-BOT RULE: Using similar arguments as above, this case is applicable only if x ∉ G,
and as the arguments above show, if any correct process applies this rule, all will.

For nodes with |σx| ≤ φ − 1, identical arguments to those used in the proof of Claim 2 complete the proof of this case. □

This completes the proof of Claim 3. □ 

We now prove the theorem for the case of φ = 1.

By assumption there is at most one faulty process, say b, that doesn’t appear in FA of any correct process. There
are at most two rounds of information exchange.

In the first round every process sends its input value. By the end of the first round, at every z, IT_z(ε) = d_z,
PT_z(z) = d_z, and for every x ∈ N \ F_z, IT_z(x) = d_x, where d_x is the value received from x, and for every y ∈ F_z,
IT_z(y) = ⊥. The only rule that may be applied by a correct process by the end of this round is the EARLY_IT-TO-RT
RULE.

Assume that z applies the EARLY_IT-TO-RT RULE by the end of round 1. This can happen only when all inputs
are ⊥ or when f_φ = 0 and all input values are identical. If this happens, z sets RT_z(ε) = IT_z(ε) = d_z, and z does not
send any message in round 2. Following that, every correct process p that doesn’t stop sends to every process the set
of values it entered to IT_p(x) for every x ∈ N. If a correct process z stops, all these values are identical, other than
the values associated with b. Moreover, for z and any other correct process that did not send a message, all correct
processes add to their IT the same value for it. By the end of round 2, every correct process, p, that did not stop
applies ROUND φ+1 RULE to copy IT_p(σ), for σ ∈ Σ_2, to RT_p(σ). By the previous discussion it is clear that all
will have identical values regarding all nodes, other than maybe the nodes σ that include b. Therefore, every correct
process p will be able to apply RELAXED RULE and will put the same value to ε.

This discussion shows, implicitly, that all the items of the theorem hold for this case.

Now consider the case that no correct z stopped at the end of round 1. In the second round every correct process
sends to every process the set of values it entered to IT_p(x) for every x ∈ N. By the end of round 2 every correct
process, p, applies ROUND φ+1 RULE to copy IT_p(σ), for σ ∈ Σ_2, to RT_p(σ). By the end of the second round, for
every p, q, z correct processes RT_p(zp) = RT_p(zq) = RT_q(zp).

Since b is the only potentially faulty process we conclude that for every p, q, z ∈ N \ {b}, RT_p(bz) = RT_q(bz).
We now show that the theorem holds in this case.

To prove that Item 2 (Validity) holds, let’s look at its three statements. Statement 2c vacuously holds. Statement 2a
holds, since for every correct process that sends in the last round there is consensus. For every p in G that sends in
the first round, as we mentioned before, all processes, but b, sent the same value d_p that p sent in the first round, and
by applying RELAXED RULE, which can be applied to node p, all reach consensus. For every p ∈ FA, the same
arguments hold.

To prove that Item 3 holds, let’s look at its four statements. 

Proof of Statement 3a when φ = 1. By definition, node p, p ∈ G, can apply RESOLVE RULE only on node ε. Assume
it resolved to d, d ≠ ⊥. By definition p observed at least 2 processes as voters to d, and it identified n − t RT-confirmed
nodes. All correct processes among them will never resolve to ⊥. The only possibility that another correct process q
can resolve any to ⊥ is node b. If node b is RT-confirmed, it has at least 2 children nodes x, y such that RT_p(bx) =
RT_p(by) = ⊥. Since both x and y are necessarily correct processes, we conclude that RT_q(bx) = RT_q(by) = ⊥.
Moreover, for all RT-confirmed nodes z in RT_p, except node b, RT_p(z) = RT_q(z) = d, since all are correct. The
only rule q may be able to apply to resolve b to ⊥ is SPECIAL-BOT RULE. But SPECIAL-BOT RULE is not applicable
to nodes of level 1. □

Proof of Statement 3b and Statement 3c when φ = 1. These statements clearly hold since PT_p is defined only for
nodes in level 1, and PT_q is not defined for level φ + 1. □

Proof of Statement 3d when φ = 1. Consider three cases. If x ∈ G, then by Item 2 we conclude equality. Consider
the case that x = b. In this case, as we wrote above, for every p, q, z ∈ N \ {b}, RT_p(bz) = RT_q(bz). Therefore, if p
applied a rule to conclude b ∈ PT_p, so will q. We are left with the case of x = ε. As we just proved, on every node of
level 1, p and q agree. All but one of them are nodes associated with correct processes. The only node on level 2 on
which p and q differ is node xb. But because of coloring, both color node xb by the value of x. Therefore, on every
node σ, |σ| ≥ 1, if σ ∈ RT_p, then σ ∈ RT_q. Therefore, every rule p applies holds also for q. This completes the proof
of Statement 3d. □

To prove that Item 4 holds consider the 3 possible levels. ROUND φ+1 RULE implies that it holds for level φ + 1.
Statement 3d proves the rest of the cases.

To prove that Item 1 holds observe that by Property 1, it holds initially. In the first round, no detection takes place. 
In the 2nd round, no correct process suspects any other correct process. 

This completes the proof of Theorem 2. □ 

The following Theorem summarizes the properties needed from our protocol. 

Theorem 3. For a (t, φ)-adversary and protocol D_φ, n ≥ 3t + 1, and assuming that all correct processes participate
in the protocol:

1. Every correct process outputs the same value.

2. If the input values of all correct processes are the same, this is the output value. Every correct process outputs
it by round 2 and stops by round 3.

3. If t + 1 of the correct processes hold an input value of ⊥, then all correct processes output ⊥ by the end of round
3 and stop by the end of round 4.

4. If the actual number of faults is f_φ < φ, then all correct processes complete the protocol by the end of round
f_φ + 2.

5. If the actual number of faults is f_φ = 0, and all correct processes start with the same initial value, then all
correct processes complete the protocol by the end of round 1.

6. If the actual number of faults is f_φ = 1, and all correct processes start with the same initial value, then all
correct processes complete the protocol by the end of round 2.

7. If a correct process outputs in round k, it stops by the end of round k + 1.

8. If a correct process stops in the end of round k, all correct processes output by round k + 1 and stop by round
k + 2.
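
Statements 7 and 8 give simple round deadlines relative to the round k in which a process outputs or stops. As a hedged illustration only, the Python sketch below encodes these two deadlines; the names `Deadlines`, `own_stop_deadline` and `others_deadlines` are hypothetical and do not appear in the protocol.

```python
from typing import NamedTuple


class Deadlines(NamedTuple):
    output_by: int  # latest round by which every correct process outputs
    stop_by: int    # latest round by which every correct process stops


def own_stop_deadline(output_round: int) -> int:
    """Statement 7: a process that outputs in round k stops by the end of round k + 1."""
    return output_round + 1


def others_deadlines(stop_round: int) -> Deadlines:
    """Statement 8: if some correct process stops at the end of round k, all
    correct processes output by round k + 1 and stop by round k + 2."""
    return Deadlines(output_by=stop_round + 1, stop_by=stop_round + 2)
```

For example, `others_deadlines(3)` gives an output deadline of round 4 and a stop deadline of round 5.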

Proof of Theorem 3. 

Proof of Statement 1: By definition a correct process outputs a value once it identifies a frontier. It is clear that by
the end of round φ + 1 there is a frontier for every correct process. Define the front of RT to be: σx is in the front of
RT if there exists p ∈ G such that σx ∈ RT_p and for every q ∈ G, σ ∉ RT_q. Theorem 2 implies that if a node is in the
front of process p, within two rounds it will be in the front of any other correct process. Since all correct processes share
the front, then if ε ∈ RT_p, it will be at every other correct process and vice versa. Since a process does not stop for two
rounds after it holds a frontier the first claim holds.

Proof of Statement 2: Lemma 1 proves the second claim.

Proof of Statement 3: The proof of Theorem 2 implies that SPECIAL-ROOT-BOT RULE can be applied by the end of
round 3 if there are t + 1 correct processes that start with input ⊥. Thus, the third claim holds.

Proof of Statement 4: Observe that if the actual number of faults is and then for every a £ there is a 

prefix of length k, k < f^ + 1 in which a correct process appears as the last node. If k < f—l then by Theorem 2, by 
k + 2 every correct process will have that prefix in its TZT and will be able to apply DECAY RULE to close the branch 
by the end of round f + 2. 

Consider a prefix rp of length <j)+l. By assumption r contains all faulty processes. Therefore, by the end of round 
<j) + 2, every correct process will be able to apply EARLY_lT-TO-RT RULE to add rp to TZT and will close the branch. 
Observe that sometimes more than one rule can be applied, but since we go down from the later rounds to the earlier 
ones, we happen to close the branch earlier. 

We are left with the case of rp of length f. There is at most one corrupt node, say x, that can send values relating 
to rp that will be added to the IT of correct processes in rounds f + 1 and round f + 2. In round </> + 1 all correct 
processes becomes children nodes and by the end of round + 2 all will add rp to their 'PT and would be able to 
apply STRONG_lT-TO-RT RULE to close the branch. 

Thus, in all cases, by the end of round (/) + 2 all correct processes will close all branches and can output a value. 
Proof of Statement 5: Since there are no faults, all correct processes apply EARLY_IT-TO-RT RULE by the end of the
first round to set a value to ε.

Proof of Statement 6: Since there is a single fault, all correct processes apply STRONG_IT-TO-RT RULE by the end of
the 2nd round to set a value to ε.

Proof of Statement 7: The branch closing rules immediately imply that there can be at most one round between
adding the final value to RT that produces the frontier, thus providing output, and closing of all branches that imply
stopping the protocol.

Proof of Statement 8: The first part of the statement holds, since if p stops by the end of round k, it doesn’t send
anything in round k + 1. Theorem 2 (Statement 2c) implies that by the end of that round every correct process will
output a value, and by the previous statement all will stop by the end of k + 2. □


4 Monitors 

We follow the approach of [BG93, GM93, GM98] with some modifications for guaranteeing early stopping. 

In round r = 1 we run D_t using the initial values. For each integer k, in round 1 < r = 1 + 4k < t − 1 we invoke
protocol D_{t−1−4k}, whose initial value is either ⊥ (meaning everything is OK) or BAD (meaning that too many corrupt
processes were detected). We call this sequence of protocols the basic monitor sequence. We will actually run 4 such
sequences.

4.1 The Basic Monitor Protocol 

Each process z stores two variables: v ∈ D, the current value, and early, a boolean value. Initially v equals the initial
input of process z and early := false. Later, early = true will be an indicator that the next decision protocol must
decide ⊥ (because there is not enough support for BAD). Each process remembers the last value of early_q it received
from every other process q, even if q did not send one recently.

Throughout this section we use the notation: r̄ = r (mod 4).


Algorithm 1: The Basic Monitor protocol (at process z)

 1: if r̄ = 1:
 2:     if r < t − 1 then invoke protocol D_{t+1−r} with initial value v_z;
 3: if r̄ = 2:
 4:     at the end of the round:
 5:         if |FA| > r + 3 then set v_z := BAD
 6:         otherwise set v_z := ⊥;
 7: if r̄ = 3:
 8:     send v_z to all;
 9:     at the end of the round:
10:         if |{q | v_q = BAD}| < t then set early_z := true
11:         otherwise set early_z := false;
12: if r̄ = 0:
13:     send early_z to all;
14:     at the end of the round:
15:         if |{q | early_q = true}| > t + 1 then set v_z := ⊥;
16:         if every previously invoked protocol produced an output then set v_z := ⊥.
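
The four-round cycle of Algorithm 1 can be summarized by which action the monitor takes in a given round. The Python sketch below is a hedged illustration of that schedule only: the `Phase` names are hypothetical labels, and the thresholds and the invocation bound of the algorithm are deliberately left out.

```python
from enum import Enum


class Phase(Enum):
    """Monitor action selected by r mod 4 (labels are illustrative)."""
    INVOKE = 1        # r mod 4 = 1: possibly invoke a new protocol instance
    UPDATE_VALUE = 2  # r mod 4 = 2: set v to BAD or to bottom at end of round
    SEND_VALUE = 3    # r mod 4 = 3: send v, then update early at end of round
    SEND_EARLY = 0    # r mod 4 = 0: send early, then possibly reset v


def monitor_phase(r: int) -> Phase:
    """Return the monitor phase for round r, with rounds numbered from 1."""
    if r < 1:
        raise ValueError("rounds are numbered from 1")
    return Phase(r % 4)
```

Rounds 1, 5, 9, ... map to `Phase.INVOKE`, matching the statement that a new protocol is invoked every 4 rounds.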


The monitor protocol runs in the background until the process halts. The monitor protocol invokes a new
protocol every 4 rounds. In each round, the monitor’s lines of code are executed before running all the other protocols,
and its end of round lines of code are executed before ending the current round in all currently running protocols.
This is important, since it needs to detect, for example, whether all currently running protocols produced outputs for
determining its variable for the next round. At the end of each round the monitor protocol applies the monitor_halting
and monitor_decision rules below to determine whether to halt all the running protocols at once, or only to commit to
the final decision value.

When a process is instructed to apply a monitor_decision it applies the following definition. If it is instructed to
halt (monitor_halting), then if it did not previously apply the monitor_decision, it applies monitor_decision first and then
halts all currently running protocols that were invoked by the monitor at once.

Definition 1 (monitor_decision). A process that did not previously decide, decides BAD, if any previously invoked
protocol outputs BAD. Otherwise, it decides on the output of D_t.
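
Definition 1 can be read as a small decision function. The following Python sketch is an illustration under stated assumptions: the caller is assumed to pass the outputs produced so far by the previously invoked protocols, together with the output of D_t, which is assumed to be available when the function is called. The argument names are hypothetical.

```python
from typing import Hashable, Iterable

BAD = "BAD"


def monitor_decision(outputs_so_far: Iterable[Hashable],
                     output_of_dt: Hashable) -> Hashable:
    """Decide BAD if any previously invoked protocol output BAD,
    otherwise decide on the output of D_t (assumed already produced)."""
    if any(out == BAD for out in outputs_so_far):
        return BAD
    return output_of_dt
```

A process that has already decided would not call this function again, since the definition applies only to a process that did not previously decide.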

When a process is instructed to decide without halting, it may need to continue running all protocols for a few more
rounds to help others to decide. We define “halt by r + x” to mean continue to run all active protocols until the end of
round min{r + x, t + 1}, unless a halt is issued earlier.
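
The “halt by r + x” convention amounts to capping the target round at t + 1. A minimal Python sketch follows, with a hypothetical function name; it ignores the case where a halt is issued earlier.

```python
def halt_by_round(r: int, x: int, t: int) -> int:
    """Last round to keep running active protocols under 'halt by r + x'.

    Assumption: no earlier halt is issued, so only the cap at t + 1 matters.
    """
    return min(r + x, t + 1)
```

For instance, with t = 9 a process told in round 9 to halt by r + 2 runs only until round 10.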

4.2 Monitor Halting and Decision Conditions 

Given that different processes may end various invocations of the protocols in different rounds we need a rule to make
sure that all running protocols end by the end of round f + 2. The challenge in stopping all protocols by the end
of f + 2 is the fact that individual protocols may end at round f + 2 and we do not have room to exchange extra
messages among the processes. This also implies that we need to have a halting rule at every round of the monitor
protocol, since f + 2 may occur at any round.

Each halting rule implies how other rules need to be enforced in later rounds, since any process may be the first to
apply a monitor_halting at a given round and we need to ensure that for every extension of the protocols, until everyone
decides, all will reach the same decision despite the fact that those that have halted are not participating any more. The
conditions take into account processes that may have halted. A process considers another one as halted if it doesn’t
receive any message from it in any of the concurrently running set of invoked protocols, monitors and the gossiping of
F.

To achieve that we add the following set of rules. 

Monitor Halting Rules: 

H_BAD. Apply monitor_halting if any monitor stops with output BAD. Otherwise if any monitor outputs BAD, apply 
monitor_decision now and monitor_halting by r + 2. 

H_1. Case r̄ = 1: 

(a) If all previously invoked protocols stopped, apply monitor_halting. 

(b) Otherwise, if only the latest invoked protocol did not stop and |{q | early_q = true or q halted}| ≥ n − t, 
then apply monitor_halting. 

(c) Otherwise, if only the latest invoked protocol did not stop and |{q | early_q = true or q halted}| ≥ t + 1, 
then apply monitor_decision now and monitor_halting by r + 2. 

H_2. Case r̄ = 2: 

(a) If all previously invoked protocols stopped, apply monitor_halting. 

(b) Otherwise, if only the latest invoked protocol did not stop and |{q | early_q = true or q halted}| ≥ n − t 
was true in the previous round, then apply monitor_halting. 

(c) Otherwise, if only the latest invoked protocol did not stop and |{q | early_q = true or q halted}| ≥ t + 1 
was true in the previous round, then apply monitor_decision now and monitor_halting by r + 1. 

H_3. Case r̄ = 3: If all previously invoked protocols stopped, apply monitor_halting. 

H_4. Case r̄ = 0: If all previously invoked protocols stopped and |{q | early_q = true or q halted}| ≥ n − t then 
apply monitor_halting. 
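To make the case structure of the halting rules easier to follow, here is a minimal sketch of how a single process might evaluate them in one round. It is not the authors' implementation. The names `MonitorView` and `monitor_halting_rule` are hypothetical, the case index is taken to be r mod 4, and the two thresholds are read as "at least n − t" and "at least t + 1"; all of these are assumptions of the sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class MonitorView:
    """What one process observes at the end of round r (hypothetical record)."""
    r: int                     # current round
    n: int                     # number of processes
    t: int                     # bound on corrupt processes
    all_stopped: bool          # all previously invoked protocols stopped
    only_latest_running: bool  # only the latest invoked protocol did not stop
    count_now: int             # |{q : early_q = true or q halted}| this round
    count_prev: int            # the same count as seen in the previous round
    stopped_bad: bool = False  # some monitor stopped with output BAD
    output_bad: bool = False   # some monitor output BAD

# ("halt", r): halt now; ("decide_now_halt_by", k): decide now, halt by round k.
Action = Optional[Tuple[str, int]]

def monitor_halting_rule(v: MonitorView) -> Action:
    # Rule for BAD outputs, checked before the per-phase rules.
    if v.stopped_bad:
        return ("halt", v.r)
    if v.output_bad:
        return ("decide_now_halt_by", v.r + 2)

    phase = v.r % 4  # assumed meaning of the case index in the rules
    if phase in (1, 2):
        # Case 1 uses the current count, case 2 the previous round's count.
        count = v.count_now if phase == 1 else v.count_prev
        if v.all_stopped:
            return ("halt", v.r)
        if v.only_latest_running and count >= v.n - v.t:
            return ("halt", v.r)
        if v.only_latest_running and count >= v.t + 1:
            # Case 1 allows two more rounds, case 2 only one.
            return ("decide_now_halt_by", v.r + (2 if phase == 1 else 1))
        return None
    if phase == 3:
        return ("halt", v.r) if v.all_stopped else None
    # phase == 0
    if v.all_stopped and v.count_now >= v.n - v.t:
        return ("halt", v.r)
    return None
```

Returning `None` means no rule fires in this round and the process keeps running. Note that in cases 1(c) and 2(c) the halting deadline is the same absolute round, since r + 2 from a round with index 1 equals r + 1 from the following round.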

Lemma 3. If n > 3t and there are f, f ≤ t, corrupt processes then all correct processes apply monitor_halting by the 
end of round min(t + 1, f + 2). 

Proof. We need to show that all previously invoked protocols halt by the end of round min(t + 1, f + 2). Observe that 
Theorem 3 (Statement 4) implies that D_t itself is stopped by min(t + 1, f + 2). 

By definition, protocol D_φ is invoked in round r_φ, where φ = t + 1 − r_φ. By Theorem 3 (Statement 4), D_φ is 
stopped by min(φ + 1, t_φ + 2), if the upper bound on the number of faults (that were not detected by all correct 
processes before invoking the protocol) is t_φ. Note that if the number of faults that are not detected by all is higher 
than t_φ, the protocol may not stop by φ + 1. 

Let us study the number of faults that are not detected by all correct processes when D_φ is invoked. Figure 1 Line 3 
indicates that if any correct p set v_p := BAD in round r_φ − 3, then, by Lemma 2, the number of faults that are not 
detected by all correct processes when D_φ is invoked is at most t − r_φ. In such a case, by Theorem 3, D_φ will be 
stopped by round min(φ + 1, t_φ + 2), where t_φ ≤ t − r_φ. Let us call these D_φ regular-protocols. 

If no correct p sets v_p := BAD, then all correct processes invoke D_φ with v = ⊥. Therefore, no matter how many 
faults are present (as long as not more than t), Lemma 1 guarantees that D_φ is stopped within 3 rounds, and all outputs 
are obtained within 2 rounds. Let us call these D_φ fast-protocols. 

For regular-protocols we need to prove that the extra conditions hold. In addition, for fast-protocols we need also 
to prove that the protocol that was invoked recently will also stop in time. 

Let us consider the value r (mod 4) of the round at which min(t + 1, f + 2) falls. 

Case min(t + 1, f + 2) (mod 4) = 0: By H_4 we need to show that all previously invoked protocols will be stopped 
and that |{q | early_q = true or q halted}| ≥ n − t at every correct process. 
For regular-protocols, since all are stopped by round min(t + 1, f + 2), when correct processes executed 
Line 3, just before stopping, none would set v := BAD. Therefore, all will set v to ⊥ and later early to true. Thus, 
the extra property for H_4 holds, and all will halt. 

For fast-protocols, since no process sets v to BAD, every previously invoked protocol stops within at most 3 rounds 
(Theorem 3, Statement 2). The latest protocol was invoked 3 rounds ago, and we are done. The arguments for the 
extra condition in H_4 are the same as for the regular-protocols. 

Case min(t + 1, f + 2) (mod 4) = 3: By H_3 we need to show that all previously invoked protocols will be stopped. 

The arguments for regular-protocols and for fast-protocols are the same: the latest invocation was two rounds ago, 
and therefore, by Theorem 3 (Statement 2), by the end of the current round all will be stopped. 

Case min(t + 1, f + 2) (mod 4) = 2: By H_2 we need to show that either all previously invoked protocols have 
stopped by the end of the current round, or all but the last one have stopped and the extra condition holds. 

If min(t + 1, f + 2) = t + 1, then no protocol was invoked in the previous round, by definition. All previous 
regular or fast protocols will be stopped by the end of the current round. 

If min(t + 1, f + 2) = f + 2, by Theorem 3 (Statement 4), using similar arguments as above, all previous protocols 
will be stopped by the end of the current round, except, maybe, the last protocol that was invoked in the previous round. 
Observe that correct processes set up their v four rounds ago. Since the current round is f + 2, the round at which 
the processes executed Line 3 in Figure 1 is f − 2 and therefore no process could have more than f faults, and would 
have set v := ⊥. Therefore, every correct process that has not halted yet would send early = true two rounds ago, and 
therefore the extra condition for H_2 holds. 

Case min(t + 1, f + 2) (mod 4) = 1: By H_1 we need to show that either all previously invoked protocols have 
stopped by the end of the current round, or all but the last one have stopped and the extra condition holds. 

If min(t + 1, f + 2) = t + 1, then no protocol was invoked in the current round, by definition. All previous regular 
or fast protocols will be stopped by the end of the current round. 

If min(t + 1, f + 2) = f + 2, by Theorem 3 (Statement 4), using similar arguments as above, all previous protocols 
will be stopped by the end of the current round, except, maybe, the last protocol that was invoked in the previous round. 
Observe that correct processes set up their v three rounds ago. Since the current round is f + 2, the round at 
which the processes executed Line 3 in Figure 1 is f − 1 and therefore no process could have more than f faults, and 
would have set v := ⊥. Therefore, every correct process that has not halted yet would send early = true two rounds 
ago, and therefore the extra condition for H_1 holds. □ 

Lemma 4. If the first process applies monitor_halting in round r on d then every correct process applies 
monitor_decision by round min{r + 4, f + 2, t + 1}, applies monitor_halting by round min{r + 5, f + 2, t + 1}, and 
obtains the same decision value, d. 

Proof. Let p be a correct process applying monitor_halting in the earliest round that any correct process applies it. 

Observe that in some of the halting rules a process decides before the last invoked protocol outputs a value. There 
may be cases in which one process halts and other processes continue to run and even invoke an additional protocol after 
the halting. We later prove that whenever these cases happen, the decision value is the same and it is not BAD. We show 
that any protocol whose output is not taken into account by any correct process must output ⊥. 

Consider first the case that p halts with output BAD. By Theorem 3 (Statement 1 and Statement 8), if p halts with 
output BAD and if the output of that protocol is not ignored by any correct process then all correct processes will output 
BAD by the next round and will halt within two rounds. This will lead to a unanimous decision. 

So, pending the later proof that any protocol whose output is not taken into account by any correct 
process will output ⊥, we are left to consider the case that p does not output BAD. 

If r = min(t + 1, f + 2), we are done by Lemma 3 (and Theorem 3, Statement 1). Since every correct process 
considers the outputs of the same set of protocols, the decision value is the same at every correct process. 

Consider the various halting rules used by p to apply monitor_halting, and let r be the round at which it was applied. 
Case p uses H_1: There are three possibilities. In the first, p noticed that all previously invoked protocols stopped. In 
this case, Theorem 3 (Statement 8) implies that all correct processes will observe that all previously invoked protocols 
reported output by the end of r + 1 and will observe that all previously invoked protocols have stopped by the end of 
round r + 2 and will use rule H_3 to apply monitor_halting. All correct processes obtain the same decision value, since all will 
consider the same set of protocols and, by Theorem 3 (Statement 1) and the decision rule, will decide the same. 

Otherwise, when p executed round r it noticed that by the end of that round all previous protocols stopped and 
only the one that started at the beginning of round r did not stop yet and the values of early that p received in round 
r − 1 imply that |{q | early_q = true or q halted}| ≥ n − t. Since no process halted earlier, in round r − 1 every 
correct process sets v := ⊥. By Lemma 1, the protocol that started in round r will produce output of ⊥ in round r + 1 
at all correct processes that did not stop earlier, and will stop by round r + 2. Thus, every correct process will apply 
either H_2 or H_3 and will reach the same decision. 

Otherwise, when p executed round r − 2 it noticed that by the end of that round all previous protocols stopped and 
only the one that started at the beginning of round r − 2 did not stop yet. Moreover, p received at the beginning of 
round r − 3, |{q | early_q = true or q halted}| ≥ t + 1. Since no correct process halted earlier, the instruction to set the value 
for early implies that there was a correct process q that set its early_q to true in round r − 4. Thus, q received less 
than t BAD. This implies that there are t + 1 correct processes with v = ⊥. Lemma 1 implies that the last protocol 
starting in the beginning of round r − 2 will output value ⊥ by the end of round r and stop by the end of round r + 1. 
By the end of round r all correct processes will observe the outputs of all previously invoked protocols. Therefore, 
by the end of round r + 1 all correct processes that did not apply monitor_halting already will either be able to apply 
monitor_halting by the end of that round, or will set v := ⊥, since all previously invoked protocols produced output 
and even stopped. Since the latest invoked protocol is guaranteed to produce an output of ⊥, those that have halted will 
reach the same decision. Notice that those processes that do not halt will start another protocol in which every correct 
process that invoked it has input ⊥ and the rest are not participating. By Corollary 1, by the end of round r + 3 they 
will decide the same decision value and will halt by the end of round r + 4. 

Case p uses H_2: As in the previous case, there are three possibilities. In the first, p noticed that all previously 
invoked protocols stopped. In this case, Lemma 1 implies that all correct processes will observe that all previously 
invoked protocols reported output by the end of r + 1 and have stopped by the end of round r + 2. Some may use rule 
H_3 or rule H_4 to apply monitor_halting and decide the same, and some will invoke the next protocol with input ⊥ and 
will reach the same decision by round r + 4 and will halt by the end of round r + 5. 

Otherwise, when p executed round r it noticed that by the end of that round all previous protocols stopped and 
only the one that started at the beginning of round r − 1 did not stop yet and the values of early that p received in 
round r − 2 imply that |{q | early_q = true or q halted}| ≥ n − t. Since no correct process halted earlier, in round 
r − 2 every correct process sets v := ⊥. The protocol that started in round r − 1 will produce output of ⊥ in round r 
and stop by round r + 1. Thus, every correct process will reach the same decision and will use rule H_3 to halt by the 
end of round r + 1. 

Otherwise, when p executed round r − 1 it noticed that by the end of that round all previous protocols stopped 
and only the one that started at the beginning of round r − 2 did not stop yet. Moreover, p received in round r − 3, 
|{q | early_q = true or q halted}| ≥ t + 1. Since no correct process halted earlier, as in the case for halting rule 
H_1, we are done. 

Case p uses H_3: Here we need to consider the case where all previously invoked protocols were stopped. In this 
case every other correct process that did not apply monitor_halting in round r will notice currently running protocols 
producing outputs by the end of round r + 1 (Theorem 3, Statement 8) and stopping by the end of round r + 2. 
Therefore, by the end of round r + 1 every correct process that will not halt by the end of round r + 1 will set 
v := ⊥. Thus, all correct processes participating in the new protocol in round r + 2 will have an input ⊥, and every 
correct process not participating will be assumed to have an input ⊥. Thus (Corollary 1), by the end of round r + 3 that 
protocol produces an output, and all decide the same decision value and halt by the end of round r + 4. 

Case p uses H_4: Here we need to consider the case where all previously invoked protocols were stopped, and, in 
addition, p observes |{q | early_q = true or q halted}| ≥ n − t, which leads to halting by the end of round r. In 
this case, every other correct process that did not apply monitor_halting in round r will notice all previously invoked 
protocols producing outputs by the end of round r + 1 and stopping by the end of round r + 2 (Theorem 3, Statement 8). 
The property |{q | early_q = true or q halted}| ≥ n − t implies that by the end of round r + 1 or r + 2 every correct 
process will notice |{q | early_q = true or q halted}| ≥ t + 1. By the end of round r + 1 all correct processes that 
did not halt in round r, but noticed that all previously invoked protocols stopped by the end of round r + 1, will apply 
monitor_halting in that round. Those that will notice that all previously invoked protocols, except the one starting in 
round r + 1, have stopped will apply monitor_halting. By the same arguments as for the case of using rule H_1, the 
decision value is identical at all correct processes. 

By the end of round r + 2, all other correct processes that did not already apply monitor_halting will either 
observe that all previously invoked protocols have stopped and will apply monitor_halting, or will observe that all 
previously invoked protocols except the one starting in round r + 1 have stopped and will have the condition that 
|{q | early_q = true or q halted}| ≥ t + 1 and will apply monitor_decision by the end of round r + 2 and will halt by 
the end of round r + 3, thus potentially ignoring the output of the last protocol. Again, using previous arguments, all 
decision values are the same. □ 

Lemmas 3 and 4 complete the correctness part of Theorem 1. To simplify the polynomial considerations we look 
at a pipeline of monitors. 


4.3 Monitors Pipeline 

The basic monitor protocol runs a sequence of monitors and tests the threshold on the number of faults every 4 rounds 
(Line 5). This allows the adversary to expose more faults in the following round, and be able to further expand the tree 
before the threshold is noticed the next time the processes execute Line 5. To circumvent this we will run a pipeline of 
3 additional sequences of monitors on top of the basic one appearing above. Doing this we obtain that in every round 
r one of the 4 monitor sequences will be testing the threshold on the number of faults. 

Monitor sequence i, for 1 ≤ i ≤ 4, begins in round i and invokes protocols every 4 rounds: in every round r, 
1 ≤ r = i + 4k ≤ t − 1, it invokes protocol D_{t−i−4k}. Monitor sequence 1 is the basic monitor sequence defined 
in the previous subsection. Each monitor sequence independently runs the basic monitor protocol (Figure 1) every 4 
rounds. In the monitor protocol, the test r̄ = j, which stands for r̄ = r (mod 4) in the basic monitor sequence, is 
replaced with r̄_i = j, which stands for r̄_i = r + 1 − i (mod 4) = j (naturally only for r + 1 − i > 0). Each of the 
four monitor sequences decides and halts separately, as in the previous section above. 
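The staggering of the four monitor sequences can be checked with a short sketch. The helper names `phase` and `invoking_sequence` are hypothetical. The code only encodes the two facts stated above: sequence i starts in round i and invokes every 4 rounds, and its local test value is (r + 1 − i) mod 4.

```python
from typing import Optional

def phase(r: int, i: int) -> Optional[int]:
    """Local test value of monitor sequence i (1..4) in round r.

    Returns None while the sequence has not started (r + 1 - i <= 0).
    """
    offset = r + 1 - i
    if offset <= 0:
        return None
    return offset % 4

def invoking_sequence(r: int) -> int:
    """Index of the sequence whose invocation rounds i, i + 4, i + 8, ... contain r."""
    return (r - 1) % 4 + 1
```

From round 4 onwards the four sequences occupy four distinct values of the test in every round, so each step of the monitor protocol, including the threshold test, is executed by exactly one sequence per round.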

Notice that protocol D_t is invoked only by the basic sequence (Sequence 1). For each of the three other monitor 
sequences, the decision rule is: decide BAD if any invoked protocol (in this sequence) outputs BAD, and ⊥ otherwise. 
Observe that Lemmas 3 and 4 hold for each individual sequence. 

We now state the global decision and global halting rules: 

Definition 2 (Global Halting). If any monitor sequence halts with BAD, or all 4 monitor sequences halt, the process 
halts. 

Definition 3. The global_decision is the output of D_t, unless any monitor sequence returns BAD, in which case the 
decision is BAD. 
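The two global rules combine the per-sequence results in a simple way. The following is a minimal sketch with hypothetical names (`global_halts`, `global_decision`, and the list-based encoding of the four sequences); it is meant only to restate Definitions 2 and 3.

```python
from typing import List, Optional

BAD = "BAD"

def global_halts(halted: List[bool], outputs: List[Optional[str]]) -> bool:
    """Definition 2: halt if some sequence halted with BAD, or all sequences halted.

    halted[k] tells whether monitor sequence k + 1 has halted;
    outputs[k] is its decision so far (None if it has not decided).
    """
    halted_with_bad = any(h and o == BAD for h, o in zip(halted, outputs))
    return halted_with_bad or all(halted)

def global_decision(dt_output: str, outputs: List[Optional[str]]) -> str:
    """Definition 3: the first protocol's output, overridden by any BAD."""
    if any(o == BAD for o in outputs):
        return BAD
    return dt_output
```

A single sequence halting with BAD is therefore enough to stop the process, while a non-BAD halt requires all four sequences to have halted.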

The following are immediate consequences of Lemmas 3 and 4 and the above definitions. 

Corollary 2. If n > 3t and there are f, f ≤ t, corrupt processes then all correct processes halt by the end of round 
min(t + 1, f + 2). 

Corollary 3. If the first correct process halts in round r on d then every correct process applies global_decision by 
round min{r + 4, f + 2, t + 1}, halts by round min{r + 5, f + 2, t + 1}, and obtains the same decision value. 


5 Bounding the size of the tree 

Following the approach in [GM98], we make the following definitions: 

Definition 4. A node σz ∈ Σ is fully corrupt if there does not exist p ∈ G and σ' ⊑ σz such that σ' ∈ RT_p[|σ'z| + 2]. 
Definition 5. A process z becomes fully corrupt at i if there exists a node σz ∈ Σ that is fully corrupt, |σz| = i and for 
every previous node |σ'z| < i, node σ'z is not fully corrupt. 

The following is immediate from the definitions above. 

Claim 4. If process z becomes fully corrupt at i then of all the nodes in the tree that end with z only nodes of round i and 
i + 1 can be fully corrupt. 

Proof. By definition of fully corrupt, all correct processes will have z ∈ F in round i + 2. So in that round and later 
all nodes will put ⊥ in IT for z. □ 

Let CT, the corrupt tree, be a dynamic tree structure. CT is the tree of all fully corrupt nodes (note that due to 
coloring, the set of fully corrupt nodes is indeed a tree). We denote by CT[i] the state of CT at the end of round i. By 
the definition of fully corrupt, at round i we add nodes of length i − 2 to CT. 

We label the nodes in CT as follows: a node σz ∈ CT is a regular node if process z becomes fully corrupt at |σz|, 
and σz ∈ CT is a special node if process z becomes fully corrupt at |σz| − 1. 

Let α_i denote the number of distinct processes that become fully corrupt at round i. For convenience, define 
α_0 = 0 (this technicality is useful in Lemma 7). Let A = α_0, α_1, ... be the sequence of counts of processes that 
become fully corrupt in a given execution. 

Following the approach of [GM98], we define waste_i = (Σ_{j ≤ i} α_j) − i. So waste_i is the number of processes that 
became fully corrupt till round i minus i (the round number). The following claim connects waste_i to FA_p[i + 3], 
the set of fully detected corrupt processes at round i + 3. 

Claim 5. For any round 4 ≤ r ≤ t + 1, and any correct process p we have |FA_p[r]| ≥ Σ_{j ≤ r−3} α_j. 

Proof. By the definition of z becoming fully corrupt at i, all correct processes will have z ∈ F in round i + 2. Due to 
the gossiping of F, all correct processes will have z ∈ FA in round i + 3. □ 

So if waste_i ≥ 6 then in round r = i + 3 we will have (Σ_{j ≤ r−3} α_j) − i ≥ 6, so by Claim 5 for each correct 
process we have |FA[r]| ≥ i + 6 = r + 3. In this case all correct processes will start in the associated monitor sequence the 
next protocol with initial value BAD and the protocol and monitor sequence and global protocol will reach agreement 
and halt on BAD by round i + 6 (by Lemma 1). 
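The waste bookkeeping is easy to experiment with. The sketch below is illustrative only: the function names are hypothetical, the input list stands for the counts of processes becoming fully corrupt per round (index 0 holding the conventional value 0), and waste is taken as the running total of these counts minus the round number, as described in the text.

```python
from itertools import accumulate
from typing import List, Optional

def waste(alphas: List[int]) -> List[int]:
    """waste[i] = (alphas[0] + ... + alphas[i]) - i."""
    return [total - i for i, total in enumerate(accumulate(alphas))]

def first_threshold_round(alphas: List[int], bound: int = 6) -> Optional[int]:
    """First round r = i + 3 such that waste[i] >= bound, or None if the
    waste always stays below the bound."""
    for i, w in enumerate(waste(alphas)):
        if w >= bound:
            return i + 3
    return None
```

For the sequence 0, 3, 3, 3, 1 the waste values are 0, 2, 4, 6, 6, so the bound 6 is first reached at i = 3. By round r = 6 at least 9 = r + 3 processes are fully detected, which is the condition used in the paragraph above.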

We will now show that if the adversary maintains a small waste (less than 6 by the argument above, but this will 
work for any constant) then the CT tree must remain polynomial sized. 

The following key lemma shows that the adversary cannot increase the number of leaves by "cross contamination". 
In more detail, if the adversary causes two fully corrupt processes at round i_1 followed by a sequence of rounds with 
exactly one fully corrupt process at each round followed by a round with no fully corrupt process at that round then 
this action essentially keeps the tree CT growing at a slow (polynomial) rate. We note that the focus on "cross 
contamination" follows the approach of [GM98]. But they only verify the case of two fully corrupt followed by a 
round with no fully corrupt. We have identified a larger family of adversary behavior that does not increase the waste 
(in the long run). Our proof covers this larger set of behaviors and this requires additional work. 

Lemma 5. Assume 0 < i_1 < i_2 such that α_{i_1} = 2, α_{i_2} = 0 and for all i_1 < i < i_2, α_i = 1. Then for any 
σ ∈ Σ_{i_1−1} ∩ CT it is not the case that there exists σpτ ∈ Σ_{i_2+1} ∩ CT and there exists σqτ' ∈ Σ_{i_2+1} ∩ CT (so there is 
at most one extension). Moreover the size of the subtree starting from σp or σq and ending in length i_2 + 1 is bounded 
by O((i_2 − i_1)^2). 

See the additional analysis in Section 5.1. 

To bound the size of CT, we partition the sequence A = α_0, α_1, ... by iteratively marking subsequences using the 
following procedure. For each subsequence we mark, we prove that it either causes the tree to grow in a controllable 
manner (so the ending tree is polynomial), or it causes the tree to grow considerably (by a factor of O(n)) but at the 
price of increasing the waste by some positive constant. Since the waste is bounded by a constant, the result follows. 
1. By Lemma 7 we know that if A contains a 0(1)*0 (a sequence starting with 0 then some 1's then 0) then it 
contains it just once as a suffix of A. Moreover, this suffix does not increase the size of the tree by more than 
O(n). Let A_1 be the resulting unmarked sequence after marking such a suffix (if it exists). 

2. Mark all subsequences in A_1 of the form 2(1)*0 (a sequence starting with 2 then some 1's then 0). By Lemma 5 
each such occurrence will not increase the number of leafs in CT (but may add branches that will close whose 
total size is at most a multiplicative factor of O(n^2) over all such sequences). Let A_2 be the remaining unmarked subsequences. 

3. Mark all subsequences in A_2 of the form X(1)*0 where X ∈ {3, ..., t} (a sequence starting with 3 or a larger 
number followed by some 1's then 0). By Lemma 8 each occurrence of such a sequence may increase the size of 
the tree multiplicatively by O(n) leafs and O(n^) non-leaf nodes, but this also increases the waste by c − 1 > 1 
(where c is the first element of the subsequence). Observe that the remaining unmarked subsequences do not 
contain any element that equals 0. Let A_3 be the remaining unmarked subsequences. 

4. Mark all subsequences of the form Y(1)* where Y ∈ {2, ..., t} (a sequence whose first element is 2 or a larger 
number followed by some 1's but no zero at the end). Again, by Lemma 8 each such occurrence may increase 
the size of the tree by O(n) leafs and O(n^) non-leafs, but this also increases the waste by c > 1. Let A_4 be 
the remaining unmarked subsequences. 

5. Since A_3 contains no element that equals zero and we removed all subsequences that have an element of value 2 or 
larger as the first element, A_4 must either be empty or be a prefix of A of the form (1)* (a series of 1's). 
Since it is a prefix of A, a sequence of 1's keeps at most one leaf. So the tree remains small. 

Thus, the size of CT is polynomial, which by Lemma 6 bounds the size of IT. This completes the proof of 
Theorem 1. 
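The marking procedure can be illustrated by a small classifier over a sequence of per-round counts. This is a simplified single-pass sketch, not the iterative procedure itself. The function name and labels are hypothetical, the input is assumed to start at round 1 (without the conventional leading 0), and the halting suffix of step 1 is not modelled, so any element that fits none of the patterns of steps 2 to 5 is simply labelled "other".

```python
from typing import List, Tuple

def mark_subsequences(alphas: List[int]) -> List[Tuple[str, List[int]]]:
    """Split alphas into labelled runs following the patterns of steps 2-5."""
    marks: List[Tuple[str, List[int]]] = []
    n = len(alphas)

    # Step 5: a leading run of 1's is the only part left unmarked.
    i = 0
    while i < n and alphas[i] == 1:
        i += 1
    if i > 0:
        marks.append(("(1)* prefix", alphas[:i]))

    while i < n:
        c = alphas[i]
        if c < 2:
            # Not the start of any pattern handled here.
            marks.append(("other", [c]))
            i += 1
            continue
        # A run starts with c >= 2 and absorbs the 1's that follow it.
        j = i + 1
        while j < n and alphas[j] == 1:
            j += 1
        closed = j < n and alphas[j] == 0   # run terminated by a 0
        end = j + 1 if closed else j
        if closed:
            label = "2(1)*0" if c == 2 else "X(1)*0"   # steps 2 and 3
        else:
            label = "Y(1)*"                            # step 4
        marks.append((label, alphas[i:end]))
        i = end
    return marks
```

For example, the sequence 1, 1, 2, 1, 0, 3, 1, 1, 0, 2, 1 is split into a prefix of two 1's, one run of the form 2(1)*0, one run of the form X(1)*0 with X = 3, and a final run of the form Y(1)* with Y = 2.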

5.1 Additional Analysis 

The following lemma bounds the size of IT as a function of the size of CT times O(n^7). 

Lemma 6. If σ ∈ IT and |σ| > 7 then there exists σ' ⊑ σ with |σ'| ≥ |σ| − 7 such that σ' ∈ CT. 

Proof. Seeking a contradiction, let σ = σ'τ be of minimal length such that σ ∈ IT, |σ| > 7, |τ| = 7 and there does 
not exist σ'τ' ∈ CT such that τ' ⊑ τ. 

Let w be the first element in τ, so σ'w ⊑ σ'τ. Since σ'w ∉ CT, by definition some correct process will 
have σ'w ∈ RT[|σ'w| + 2]. By Theorem 2 (Statement 4) all correct processes will have σ'w ∈ RT[|σ'| + 5] and will close 
the branch σ'w by round |σ'| + 6 (see DECAY RULE), a contradiction to the assumption that σ ∈ IT and |τ| = 7. □ 

The following lemma shows that the protocol stops early if the adversary causes two rounds with no new fully 
corrupt and only one fully corrupt per round between them. 

Lemma 7. If there exist 0 ≤ i_1 < i_2 such that α_{i_1} = 0, α_{i_2} = 0 and for all i_1 < i < i_2, α_i = 1 then all processes will 
halt by the end of round i_2 + 5. 

Proof. The only fully corrupt process that can appear in round i_1 + 1 is the new one from α_{i_1+1} = 1 (because α_{i_1} = 0 
and a process can be a node in CT for only two rounds starting from the first round it is fully corrupt). A simple 
induction shows that at round i_1 + j only the new fully corrupt node of round i_1 + j can appear. Once we reach round 
i_2 then no node can be fully corrupt so all branches will close and all processes will halt by the end of round i_2 + 5. □ 

We now prove the main technical result of this section, Lemma 5. It shows that having two fully corrupt then a 
series of one fully corrupt then a round with no fully corrupt does not increase the number of leafs in the tree. This 
can add some non-leaf nodes to the tree, but the overall addition of such nodes is bounded by a multiplicative factor 
of O(n^2) over all such sequences. 

Proof of Lemma 5. Let processes p, q be the two that become fully corrupt at i_1. We begin with the case that i_2 = i_1 + 1 
such that there is no process that becomes fully corrupt at i_2. Consider any σ ∈ CT where |σ| = i_1 − 1. The following 
is the subtree of σ ∈ CT that we will analyze: 

            σ
           / \
          p   q
          |   |
          q   p

The following analysis for process p shows that either σp or σq or σqp will quickly be in RT. Note that this 
implies that p, q can extend any node σ ∈ CT into at most one node of length i_2 in CT. 

Let correctDetector be the set of correct processes that detect σp via the Not Voter detection rule in round |σp| + 1. 
Let correctVoter be the remaining correct processes (that are not in correctDetector). Note that by definition of Not 
Voter, the value of all those in correctVoter must be the same. Let d be this value. 

For each σpu ∈ Σ with u ≠ q we have that σpu ∉ CT (because α_{i_2} = 0). So σpu ∈ RT[|σpu| + 2] for 
some correct processes and hence their value is fixed (otherwise σp ∈ RT and we are done) and all correct processes 
will have σpu ∈ RT[|σp| + 5]. Let faultyEcho be the set of corrupt children of σp whose value is fixed to d. Let 
faultyEchoOther be the remaining corrupt processes that are children of σp whose value is fixed to d' ≠ d. Note that σp has 
n − |σp| children of which all but child σpq must be fixed. Hence |faultyEcho| + |faultyEchoOther| > n − |σp| − 1. 

There are three cases to consider: 

Case 1: If |correctVoter| + |faultyEcho| ≥ n − t − 1 then σp ∈ RT[|σp| + 5] for all correct processes since all 
these n − t − 1 children of σp will appear in RT[|σp| + 5] and so σp ∈ RT[|σp| + 5] using the RELAXED RULE. 

Otherwise, |correctVoter| + |faultyEcho| ≤ n − t − 2 so it must be that |correctDetector| + |faultyEchoOther| ≥ 
t + 1 − |σp| = t + 2 − |σpq|. This is because σp has n − |σp| children and each one of them except child q must 
fix their value in RT[|σp| + 3]. 

Case 2: If |correctDetector| ≥ t + 2 − |σpq| then SPECIAL-BOT RULE will fire on the level i_1 + 1 node σqp. 
This will occur because all other children of σq are not fully corrupt and hence will appear in RT[|σq| + 5]. The only 
case in which SPECIAL-BOT RULE may not fire is if in the meantime σq ∈ RT, in which case we are done. 

Case 3: It must be that |correctDetector| ≤ t, hence |correctVoter| ≥ t + 1 on value d (because σ contains no 
correct process). Since |correctVoter| ≥ t + 1, all correct processes will see that σp is leaning towards d (see 
definitions 3. and 4. in the fault detection rules). 

For any w ∈ faultyEchoOther, since w does not become fully corrupt at i_1 or i_1 + 1 it must be that there are at least 
t + 1 correct processes that are children of σpw that hear from σpw a value d', d' ≠ d. So the conditions of Not 
Masking for σpw hold. 

This implies that w is 'forced' to send ⊥ for σqpw to all correct processes. For if w sends d' ≠ ⊥ to any correct 
processes for σqpw then by the Not Masking rule at round |σpw| + 2 = |σqpw| + 1 these correct processes will detect w 
as corrupt and in the same round mask σqpw to ⊥. 

Therefore there will be |correctDetector| + |faultyEchoOther| ≥ t + 2 − |σqp| children of σqp that will appear 
in IT with value ⊥ and since there is no process that becomes fully corrupt at i_1 + 1 = i_2, all other children of 
σq must appear in RT[|σq| + 5]. So the SPECIAL-BOT RULE will fire on the level i_1 + 1 node σqp. This completes 
the proof for the case i_2 − i_1 = 1. 
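The counting step that separates Case 1 from the remaining cases can be written out explicitly. The derivation below is a sketch under two assumptions suggested by the argument: the four sets correctDetector, correctVoter, faultyEcho and faultyEchoOther together cover all children of σp other than σpq, and the comparisons are non-strict.

```latex
\begin{align*}
|\mathit{correctDetector}| + |\mathit{correctVoter}|
  + |\mathit{faultyEcho}| + |\mathit{faultyEchoOther}|
  &\ge n - |\sigma p| - 1
  && \text{(all children of $\sigma p$ except $\sigma pq$ are fixed)} \\
|\mathit{correctVoter}| + |\mathit{faultyEcho}|
  &\le n - t - 2
  && \text{(Case 1 does not apply)} \\
\Longrightarrow\quad
|\mathit{correctDetector}| + |\mathit{faultyEchoOther}|
  &\ge (n - |\sigma p| - 1) - (n - t - 2) \\
  &= t + 1 - |\sigma p| \;=\; t + 2 - |\sigma pq|
  && \text{(since $|\sigma pq| = |\sigma p| + 1$)}
\end{align*}
```

This is the bound on the number of children of σqp with value ⊥ that the SPECIAL-BOT RULE needs in Cases 2 and 3.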

We can now consider the case where i_2 − i_1 > 1. The key observation is that the above argument required two 
properties for a process z that becomes fully corrupt at round i. The first is that all the level i nodes of the form σz 
have all their children (except one) fixed to some value. The second is that the level i + 1 nodes of the form σ'z have 
the property that all other children of σ' are fixed. 

Intuitively, if a child σzu is fixed to the majority value of σz then σzu will help fix σz using the relaxed rule. 
Otherwise, σzu is fixed to some d', which implies that at least t + 1 correct processes received d' from σzu. Hence 
σzu must be a masker for the round i + 1 node σ'z. 

Next we observe the structure of CT given a sequence with i_2 − i_1 > 1. Let p, q be the two processes in i_1, let ℓ = 
i_2 − i_1 + 1 and denote by x_3, ..., x_ℓ the remaining fully corrupt by order of appearance. Using an inductive argument 
one can show that any CT graph will be a subgraph of the following: for every node σ ∈ CT of length i_1 − 1 there 
will be two branches that we call special branches. These branches will be σpqx_3...x_ℓ and σqpx_3...x_ℓ. Observe 
that these branches contain only special nodes. In addition, there will be regular branches as follows: σpx_3...x_ℓ, 
σqx_3...x_ℓ, σpqx_4...x_ℓ, σqpx_4...x_ℓ, ..., σpqx_3...x_i x_{i+2}...x_ℓ, σqpx_3...x_i x_{i+2}...x_ℓ, ..., σpqx_3...x_{ℓ−2}x_ℓ, 
σqpx_3...x_{ℓ−2}x_ℓ. Observe that all these regular branches contain regular nodes and that all their children will be 
fixed due to round i_2 having no fully corrupt process. The number of regular branches is O(i_2 − i_1) and the length of 
each branch is bounded by O(i_2 − i_1). 


[Figure: example of the CT subtree below a node σ for i_2 − i_1 = 4, with children p and q followed by x_3, x_4, x_5.] 
The two special branches are the rightmost and leftmost paths. All 
other leafs are the endpoints of all the regular branches. Observe that given one more fully corrupt process, each special branch 
is split into two branches: one extends the original special branch and the other is a new regular branch that continues 
as a path. Also observe that one more fully corrupt process will simply extend the path of each regular branch by one. 


As all the regular branches will have all their children fixed, they cannot be used as leafs to extend the tree. Since 
there are O(i_2 − i_1) regular branches and each of them is of length at most O(i_2 − i_1), the total number of nodes 
added in this process is O((i_2 − i_1)^2) per each leaf in CT of length i_1 − 2. So if the size of the tree without this subtree 
is x then the total number of non-leaf nodes added by these types of sequences is at most O(xn^2) (this is a crude 
bound that can be improved). 


We now need to show that at least one of the special branches gets fixed. Since all the regular branches cannot 
expand, our goal is to prove that it cannot be the case that both special branches are not fixed (in the i_2 − i_1 = 1 case the 
analogue is that either σpq or σqp is fixed). Given the key observation and the structure statement we can now apply 
a similar argument as we did for p in the i_2 − i_1 = 1 case. We start with x_ℓ and go towards p, q. We will show that 
in each iteration on level i we either fix one of the special branches (and we are done) or we have sufficient conditions 
to use the main argument on level i − 1. 

For the base case, consider $x_\ell$. Because round $i_2$ has no fully corrupt process, all the level $i_1 + \ell - 2$ nodes of the form $\sigma' x_\ell$ (for any
$\sigma'$) have all their children fixed. So we can apply the main argument: if all these level $i_1 + \ell - 2$ nodes get fixed
using the RELAXED RULE, then all the regular branches ending with $x_{\ell-1}$ have all their children fixed and, for the two
special branches, the parent of $x_{\ell-1}$ has $x_{\ell-1}$ as its only child. Therefore we continue by induction.
Otherwise, by the argument above, all the level $i_1 + \ell - 2 + 1$ nodes of the form $\sigma' x_\ell$ (for any $\sigma'$) will be fixed by the
SPECIAL-BOT RULE. In particular this includes the special branch, so we are done.

For the general case, we assume that all level $i_1 + j - 2$ nodes of the form $\sigma' x_j$ (for any $\sigma'$) have all their children
fixed and that, for the two special branches, the parents of $x_j$ have $x_j$ as their only child. Again we can apply the
$i_2 - i_1 = 1$ arguments: if all these level $i_1 + j - 2$ nodes get fixed using the RELAXED RULE, then we continue by
induction to $j - 1$. Otherwise, by the argument above, all the level $i_1 + j - 2 + 1$ nodes of the form $\sigma' x_\ell$ (for any
$\sigma'$) will be fixed by the SPECIAL-BOT RULE. In particular this includes the special branch, so we are done since the
special branch is fixed. □


The following lemma shows that having a large number (3 or more) of processes becoming fully corrupt at a given
round, followed by a sequence of 1's and then possibly a 0, does not increase the number of leaves considerably.
Note that if + >6 then the monitor process will cause the protocol to reach agreement and stop in a constant 

number of rounds. So we only look at the case that < 6. 

Lemma 8. If 2 < -I- < 6, £ {Oj 1} for all ii < i < 12 , ai = 1 then for any a € (T CT 

there are at most 0 {i 2 — ii) nodes of the form ar n CT. Moreover the size of the subtree starting from a and 

ending in length T + f is bounded by 0 ((i 2 ~ 
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Proof. Using an overly pessimistic argument, every node cr G n CTcan have at most ai^-i ■ < 16 = 0(1) 

nodes of length $i_1 + 2$ in CT. Even if each such node is a special node, after $O(i_2 - i_1)$ rounds with just one fully
corrupt process in each round, each such node of length $i_1 + 2$ will generate at most $O(i_2 - i_1)$ regular branches, each of which is a path
with at most $O(i_2 - i_1)$ nodes.

□ 


6 Conclusion 

In this paper we resolve the problem of the existence of a protocol with polynomial complexity and optimal early
stopping and resilience. The main remaining open question is whether the complexity of such protocols can be reduced to a
low-degree polynomial. Another interesting open problem is obtaining unbeatable protocols [CGM14] (a stronger
notion than early stopping).

We would like to thank Yoram Moses and Juan Garay for insightful discussions and comments. 